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Quantum impurity models describe an atom or molecule embedded in a host material 
with which it can exchange electrons. They are basic to nanoscience as representations 
of quantum dots and molecular conductors and play an increasingly important role 
in the theory of "correlated electron" materials as auxiliary problems whose solution 
gives the "dynamical mean field" approximation to the self energy and local correlation 
functions. These applications require a method of solution which provides access to both 
high and low energy scales and is efli'ective for wide classes of physically realistic models. 
The continuous-time quantum Monte Carlo algorithms reviewed in this article meet this 
challenge. We present derivations and descriptions of the algorithms in enough detail 
to allow other workers to write their own implementations, discuss the strengths and 
weaknesses of the methods, summarize the problems to which the new methods have 
been successfully applied and outline prospects for future applications. 
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I. INTRODUCTION 
A. Overview 

This article aims to provide a comprehensive overview 
of recent developments which have made continuous- 
time quantum Monte Carlo (CT-QMC) approaches the 
method of choice for the solution of broad classes of quan- 
tum impurity models. We present derivations and de- 
scriptions of the algorithms in enough detail to allow 
other workers to write their own codes, and give a gen- 
eral introduction to diagrammatic Monte Carlo methods 
on which these algorithms are based. We discuss the 
strengths and weaknesses of the methods, and their range 
of applicability. We summarize the problems to which the 
new methods have been successfully applied, and outline 
prospects for future applications. We hope that read- 
ers will come away from the review with an appreciation 
of the power and flexibility of the techniques and with 
the knowledge needed to apply them to new generations 
of problems in nanoscience, correlated electron physics, 
nonequilibrium systems and other But before en- 

tering into specifics it is worth asking: 'what are quan- 
tum impurity models?' and also 'why study them with 
continuous time -methods?' 



B. Quantum impurity models: definitions and examples 

Quantum impurity models were introduced to describe 
the properties of a nominally magnetic transition metal 
ion embedded in a non-magnetic host metal. A mag- 
netic transition metal atom such as Fe and Co has a 
partly filled d shell, and the intra-d Coulomb interac- 
tions act to organize the electrons in the d-shell into a 
high-spin local moment configuration. Hopping from the 
d shell to the metal or vice versa favors non-magnetic 
configurations and thus competes with the local in terac- 
tions. In 1961 P. W. Anderson (I Anderson . 1961), fo l- 
lowing important earlier work of iFriedelT i 1951 . 1956[) . 
wrote down a mathematical model (now referred to as 
the Anderson Impurity Model) which encodes this com- 
petition. Anderson's concept has proven enormously 
fruitful, with implications extending far beyond its orig- 
inal context of impurity magnetism. Quantum impu- 
rity models are basic to nanoscience as rep resentations of 



quan tum dots and molecular conductors (jHanson et al. 



20071 ) and have been used to understand the adsorp- 
tion of atoms onto surfaces (Brako and Newnsl . 1981 



Langreth and Nordlandeij . 119911 ). They are of theoreti- 
cal interest a s solvab l e examples of nontrivial quantum 
field theories (jAfHeckl . l2008t IWilsonl . Il975[) and in recent 
years have played an increasingly important role in con- 
densed matter physics as auxiliary problems whose solu- 
tion gives the "dynamical mean field" (DMFT) approx- 
imation to the properties of correlated electron matcri- 
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als such as high temperature copper-oxide and pnictide 



superconductors (Georges et al . 1996 : Held et al . 20061 



Kothar et aTl . l2006l) . 



A quantum impurity model (see e.g. ( Mahan . 2000l )) 
may be represented as a Hamiltonian with three basic 
terms: i?ioc which describes the "impurity" : a system 
with a finite (typically small) number of degrees of free- 
dom, i?bath which describes the noninteracting but infi- 
nite (continuous spectrum) system to which the impurity 
is coupled, and ffhyb which gives the coupling between 
the impurity and bath. Thus 



Qi 



bath 



Hi 



hyb- 



(1) 



The physics represented by Hqi is in general nontrivial 
because [i?ioc , -ffhyb] 7^ (in physical terms, coupling to 
the bath mixes the impurity eigenstates) . 

In the situation of primary physical interest Hioc may 
be represented in terms of a set of single-particle fermion 
states labeled by quantum numbers a = 1, . . . , N (includ- 
ing both spatial and spin degrees of freedom) and created 
by operators d\ as 



ab 

HL - E I^'^'^dldldrds 
pqrs 



(2) 
(3) 

(4) 



The ah components of the matrix E describe the bare 
level structure, / parametrizes electron electron inter- 
actions and the ellipsis denotes terms with 6 or more 
fermion operators. 

^^bath may be thought of as describing bands of itiner- 
ant electrons, each labeled by a one-dimensional momen- 
tum coordinate k or band energy and an index (spin 
and orbital) a. One usually writes 



i?b 



ath 



ka'^ka- 



(5) 



ka 



The most commonly used form of the mixing term is 
characterized by a hybridization matrix V 



i?hyb=E^^"''4a4+H.C., 



kab 



although exchange couplings of the form 



^hyb ~ ■Jk,k2(=k,a(^k2bdl,dd 

kik2Cihcd 



(6) 



(7) 



also arise, most famously in the "Kondo problem" of a 
spin exchange -coupled to a bath of conduction electrons 
(lKondol[l96l . 

Coupling of impurity models to oscillators (represent- 
ing for example phonons in a solid) has also been consid- 
ered. A discussion in the CT-QMC context is presented 
in Section rvni 



It is sometimes convenient to represent the parti- 
tion function Z of the impurity model as an im agi- 
nary time path integral (jNegele and Orlandl . [l988h . In 
this representation it is easy to formally eliminate the 
bath degrees of freedom (a technique pioneered by 
Fevnman and Vernon (|l963h '). obtaining an action which 
for Hamiltonians involving a hybridization of the form of 
Eq. © is 



Z 



V[d\d\ 



S=J2 //' dTdT'di{r)\{dr+E^^'')5{r-r') 



(8) 
(9) 



+ A^\r - r')]d,{r') + f drHUr)- 
^ Jo 

In this formulation the hybridization function 

kOL 



(10) 



compactly encapsulates those aspects of the bath that 
are relevant to the impurity model physics. It will play 
a crucial role in our subsequent discussions. It is also 
often useful to define the noninteracting impurity model 
Green's function via 



^" = -(9r+E + A)-i. 



(11) 



The paradigmatic quantum impurity model is 
the single-impu rity single-orbital Anderson model 
(jAndersonl Il96l[) . In this model, H\oc describes a sin- 



gle orbital, so the label a is spin up or down, E""^ is (in 
the absence of magnetic fields) just a level energy eg, and 
the interaction term collapses to C/n-j-nj,. Thus 

-ffAiM = E ^odidcr + U n^ni (12) 

(J 

+ E i^'^'^Lda + H.c) + E £kci,ck.. 



Impurity models with more degrees of freedom are as- 
suming increasing importance. More degrees of freedom 
means a richer variety of physical phenomena, implying a 
more complicated structure for the interactions. For ex- 
ample, in transition metal oxide materials with partially 
filled d shells or in compounds involving rare earth or ac- 
tinide atoms with partially filled / shells, the interactions 
express not only the energy cost of multiply occupying 
the atom but also the Hund's rule physics that states 
of maximal spin and orbital angular momentum are pre- 
ferred. Thus the interaction Hamiltonian describing the 
energetics of different configurations of electrons in the 
d orbitals which play an important role in the physics of 
transition metal oxides with cubic perovskite structures 
i s normally written in the "Slater-Kanamori (SK)" form 
( Imada et al . 1998t Mizokawa and Fujimori . 1995 ): 
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C. State of the art prior to continuous time QMC 



H(oc = ^SK = U'^ na^na\. + {U - 2 J) ^ Ua^Ub^ 
a a^tb 

+ ([/ - 3 J) ^ UaaTlba 
a>b.(7 

-JYI {'^t'^l'^btdbi + dl^dl^db^dai'j ■ (13) 

a^b 

In nanoscience applications the impurity typically rep- 
resents the highest occupied (HOMO) and lowest unoc- 
cupied (LUMO) molecular orbitals and the interactions 
are computed from Coulomb matrix elements involving 
these orbitals. In "cluster" dynamical mean field applica- 
tions the "impurity" is thought of as a (typically small) 
number of sites with the E""^ representing an intersite 
hopping Hamiltonian, thus in a two-site approximation 
to the Hubbard model 



^^loc = Hcl=J2 ^0 [dLdla + dlj2a) (14) 

(T 

+ ^ t (d\^d2a + d\^di„ j -f U (ni-i-ni; -|- n2-\n2\) ■ 



Solving the quantum impurity model means computing 
the correlation functions of the d operators. Of these the 
most important is the d Green function [Tr denotes time- 
ordering). 

Gfir) = (T.d,(r)4(0)) . (15) 

In the absence of interactions, G'^{iujn) = Q^'°'^{i^n) = 
[{iujn — E — A)~^]a(,. The effect of interactions may be 
parametrized by the self energy S(ia;„) = {Q^) — G~^. 

Solving the quantum impurity model is conceptually 
and algorithmically challenging. As Eq. ([9]) demon- 
strates, a quantum impurity model is a quantum field 
theory in space -t- 1 time dimension. While -I- 1 di- 
mensional quantum field theories are easier to solve than 
higher dimensional ones, they are still (in the general 
case) nontrivial. Only in a few cases are exact solutions 
known, and while in many more cases the form of the 
"universal" low energy behavior has been determined, 
the dynamical mean field and nanoscience applications 
require information about behavior beyond the universal 
limit, as well as quantitative information about the pa- 
rameters describing the universal limit. A further com- 
plication is that impurity models typically involve sev- 
eral energy scales, including an interaction scale, often 
high, a hybridization scale, typically intermediate, and 
one or more dynamically generated energy scales, which 
in many cases are very low relative to the basic interac- 
tion and hybridization scales. A robust method which 
works for general models over a range of energy scales is 
required. 



Quantum impurity models have been of long- 
standing interest and a wide range of approximate 
techniques have been developed to solve them, in- 



(Yosida and Yamada 




I975I) and in flavor degener- 


acv (Colemanl 1984 




^ead and Newnd. 1983), pertur- 


bative (jAnderson et al 


. 1970t ISolvom and Zawadoswki. 


1974) and functional 


(Hedden et al\. 20041) renormal- 



i zation group, as well as "X-oper a tor" techniques 
(iGunnarsson and Schonhammer . 1983 : Hubbard . 1964 : 



Keiter and Kimball 11970^ and different formulations of 



auxihary ("slave") particle methods (AbrikosovL 1965 



Barned.[l976tlColemanl.ll984tlFlorens and Georgesll2004 : 



Read and Newnjl983 ). An important subclass of ana- 
lytical methods is based on the resummation to all or- 
ders of a particular subset of diagrams. In the impurity 
model context the most importa nt of the s e are the non- 



crossing approx imation (NCA) (iBickersl. 119871) and its 



2001 



Pruschke and Grewe 



gener alizations (jHaule et al. 

19891) . The terminology refers to the structure of dia- 
grams in an expansion in the hybridization. In this ex- 
pansion contractions of bath operators are represented 
as lines. If one uses a time-ordered perturbation theory 
diagrams may be classified by the number of times that 
lead operator lines cross and all diagrams with zero or one 
crossing may be analytically summed by solving an inte- 
gral equation. While uncontrolled, these approximations 
capture many aspects of the physics of impurity models 
and can be formulated on the real frequency axis. They 
have therefore been used as inexpensive solvers for im- 
purity models an d to study nonequilibrium phenomena 



19941 ). Very re- 



in nano-contacts (jWingreen and Meii 
cently, techniques closely related to those described here 
have been used to formulate a numerically exact solu- 



tion b ased on an expansion around the NCA (jGull et al. 
2OIOII . It is found that the approximations are accurate 



in the Mott insulating phase but do not capture the im- 
portant diagrams in the metallic phase of the Anderson 
impurity model. Powerful field-theoretical and Bethe- 
ansatz-based analytical methods have been developed to 
classify and compute exactly the un i versal low energy be- 
havior (jAffleckl . I2OO8I : lAndrei et aZI . Il983l : IWilsonl . Il975l ) 
of broad classes of quantum impurity models. However, 
the dynamical mean field and nanoscience applications 
require an approach that works for wide classes of physi- 
cally relevant impurity models and gives access to physics 
beyond the universal limit. Thus, while analytical meth- 
ods provide very valuable insights, they do not provide 
the comprehensive solutions, valid over a wide range of 
frequencies, that are needed for modern applications. 



Startin g frorn work o flWilsonI (119751) an d subsequently 
of lWhitd (|l992t ). (see (jSchollwockl l2005h for a review), 
an important set of numerical methods has been de- 
veloped based on intelligently chosen truncations of the 
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Hilbcrt space of the many-body problem in question. The 
"numerical renormalization group" (NRG) methods are 
based on iterative diagonalization using a logarithmic dis- 
cretization of the energy spectr um of the lead stat es and 
are reviewed for example in ( Bulla et al . 20081 ) while 
the "density matrix renormalization group" (DMRG) 
techniques involve an isolation of the relevant low-lying 
states. These methods are complementary to the meth- 
ods discussed here: CT-QMC methods are most natu- 
rally formulated in imaginary time and very efficiently 
handle a wide range of energy scales and relatively gen- 
eral classes of models, but require analytical continuation 
to obtain real-time information and have difficulty resolv- 
ing subtle low energy features such as the fine structure 
of quantum criticality. On the other hand, the NRG and 
DMRG methods can be formulated directly on the real 
frequency axis or in real time and are particularly power- 
ful in resolving ground states and low-lying levels but en- 
counter difficulties in providing information over a wide 
range of frequencies and the difficulties increase rapidly 
as one moves beyond the siniple A i iderson/Hubbard 
mode ls. Both D MRG (iHallbertd. 120061: iNishimoto et all 
20061 ) and NRG (|Bulla et all . l2008l ) methods have been 
implemented as "solvers" for the quantum impurity mod- 
els of dynamical mean field theory. But except for the 
single-orbital Anderson model, where NRG methods have 
proven to be useful, especially in situations where a pre- 
cise understanding of the very low energy behavior is 
crucial, NRG and DMRG solvers are not in widespread 
and general use in the DMFT community. 

A more widely applied class of techniques is based 
on the "exact diagonalization" (ED) idea introduced 
in the early days o f dyna mical mean field theory by 
Caffarel and KrauthI (jl994l ). These authors approxi- 
mated the continuum of bath energies and values of the 
hybridization by a small number of variationally cho- 
sen eigenstates and hybridization functions. Hqj then 
becomes a finite system, which is exactly diagonalized, 
leading to a Gd characterized by a delta function spec- 
trum. The cost scales exponentially with the number 
of sites considered. The largest systems which are typ- 
ically studied contain on the order of 15 sites with one 
non-degenerate orbital on each site. Thus, in the single- 
impurity Anderson model, Eq. (fT2|). the continuum of 
bath states Cka niay be approximated by 7 or 8 (x2 for 
spin) orbitals while for say a three orbital model only 
two or three bath orbitals per impurity state can be 
accommodated. With the development of more mod- 
ern algorithms and computers, enough bath sites can be 
included that for the single-orbital Anderson impurity 
model the temperature dependence can be computed, 
the convergence of result s with bath size can be stud- 
ied (jCapone et ali . |2007[ ). and s ystematic comp a risons 



2006al: iLiebsch and Tonei |2009|) and single-inipurity , 



to other methods c an be made ( Comanac et al . 20081 



Werner et al 



(Civelli et al 



■20061). Recently, results on small clusters 
2005HKancharla a/.l . [2008tlKvung et ali 



multiorbital models ( Liebschl . l2005t iLiebsch et all 120081 ) 



have also been obtained, although here the number of 
bath sites per orbital is limited and the convergence 
with bath site num ber cannot yet be addressed rigorously 



( Koch et am2008D . 



Quantum Monte Carlo techniques provide a general 
method for solving quantum field theories, and prior to 
the development of CT-QMC methods the principal im- 
purity s olver was the Hirsch-F ye quantum Monte Carlo 
method ( Hirsch and Fve . 19861 ). This method is based on 
writing an imaginary-time functional integral, discretiz- 
ing the interval [0, (3) into M equally spaced "time-slices" 
At = (3/M and then on each time-slice i applying a dis- 
crete Hubbard- Stratonovich transformation which for the 
single-orbital Anderson model is 



1 



E 



= ±1 



A = arcosh 



exp I -AtU 



(16) 



(17) 



For a fixed choice of Ising variables {si} the problem thus 
becomes a noninteracting fermion model in a time depen- 
dent z-oriented magnetic field h(Ti) = Si which may be 
formally solved, so one is left with the problem of sam- 
pling the trace over the 2*^ dimensional space of the Si. 

The Hirsch-Fye method was for almost two decades the 
method of choice, but ultimately three difficulties limit its 
power. The first is that it requires an equally spaced time 
disc retization. (A hnear-i n-/3 method for impurity mod- 
els ( Khatami et al . 2010[) . while fast, similarly requires 
discretization both of the bath and the imaginary time 
axis.) The second is that at large interactions and low 
temperatures equilibration may become an issue. While 
techn iques have been developed to ameliorate th ese prob- 
lems (|Bhimej . [200llGorelik and Bliime j . l2009l) and new 



upda t e techniques have b een proposed (jAlvarez et 



20081 : iNukala et all . |2009[) . the difficulties of managing 
the discretization and equilibration issues within Hirsch- 
Fye are real. It appears that the CT-QMC methods dis- 
cussed here are now preferred by most practitioners. The 
third, and most fundamental difficulty is that for inter- 
actions other than the simple one-orbital Hubbard model 
the Hubbard-Stratonovich fields required to decouple the 
interactions proliferate and may have to be chosen com- 
plex so sampling the space of auxiliary fields becomes 
prohibitively difficult (jSakai et aLl . l2006bl ). 



D. Why continuous time? 

Imaginary-time path integral representations of quan- 
tum problems suc h as Eq. (|9l) are mathem atically de- 
fined (see, e.g. Negele and OrlandL 1988 and refer- 
ences therein) in terms of the result of a limiting pro- 
cess in which one rewrites the partition function, Z = 



6 



exp(— /3ff), of a system described by a Hamiltonian H at 
temperature T by defining Ar ~ (3/N, tj = jAr 



as 



=,-(ri-To)H 



(18) 



The path, integral is defined by inserting complete sets 
of states between every pair of exponentials and then 
taking the limit At — > 0. This mat hematical def ini- 
tion motivates a numerical approach ( Suzukil . Il976 ) in 
which one approximates the path integral by (i) retain- 
ing a non-zero Ar and (ii) using a Monte Carlo method 
to estimate the sums over all intermediate states. The 
exact partition function is recovered after the twin steps 
of converging the Monte Carlo and extrapolating the re- 
sults to At = 0. While clever and efficient methods 



(for e xample, the Hirsch-Fye procedure (jHirsch and Fve . 
19861 ) mentioned in the previous subsection) have been 
devised for performing the Monte Carlo, the time step 
extrapolation remains an issue. The difficulties are par- 
ticularly severe for the quantum impurity problems of 
interest here because the basic object in the theory is the 
Green function, which drops rapidly as t is increased 
from and has discontinuous derivatives at t = 0,/3 
wh ich need to b e corr ectly evaluated (see e.g. Fig. 2 
m (jWerner et all l2006l )). The discretization errors are 



large, and a very small At and a precise extrapolation 
to At = are required to obtain accurate results. How- 
ever, the low energy behavior of interest is carried by 
times T ^ (3/2, so that simulations on a homogeneous 
grid require many points. Methods which do not involve 
an explicit time discretization) would therefore appear to 
be advantageous. 

The basic idea behind all of the continuous time meth- 
ods discussed in this review is to avoid the time dis- 
cretization entirely by sampling the terms in a dia- 
grammatic expansion, instead of sampling the configu- 
rations in a complete set of states. One of the first 
i mportant rnethod s to d o this is Handscomb's method 
(|Handscombl Il962[ Il964l) . This method and its general- 
i zation, the stochastic series ex pansion (SSE) algorithm 
([Sandvik and Kurkiiarvil Il99l[ ) are based on a Taylor ex- 
pansion of the partition function in powers of /3H and 
have been successful for quantum magnets. However, 
they require that the spectrum of the Hamiltonian is 
bounded from above so applications to boson problems 
require a truncation of the Hilbert space while applica- 
tions to fermion problems are limited by a bad sign prob- 
lem. 

The c ontinuous time method s in u se now stem froru 
work of ( Prokof 'ev et all Il996[) and (jBeard and Wiesel . 
19961 ). who showed that simulations of bosonic lattice 
models can be implemented simply and efficiently in 
continuous-time by a stochastic sampling of a diagram- 
matic perturbation theory for the partition function. 
The general scheme for treating diagrams with continu- 
ous variables of arbitrary nature ~ diagrammatic Monte 



Carlo - is formu l ated i n (jProkofev and Svistunovl . 11998 : 
Prokof'ev et all Il998f ). In these methods the system- 



atic errors associated with time discretization and the 
Suzuki- Trotter decomposition were eliminated. The gain 
in computational efficiency is so large that the problem of 
simulating unfrustrated bosonic lattice models can now 
be considered as solved, although special cases, for ex- 
ample bosons coupled to a gauge field (rotating atomic 
gases, charged bosons in a magnetic field) remain chal- 
lenging. 

The success of CT-QMC methods for bosons stimu- 
lated efforts to adapt th e technique to fermionic problems 
(|Rombouts et all Il999l) . However, in contrast to stan- 
dard (unfrustrated) bosonic systems, where diagrams all 
have the same signs, in fermionic models individual di- 
agrams may have positive or negative signs, so that the 
sampling of individual diagrams suffers from a severe sign 
problem. This sign problem may be reduced by combin- 
ing classes of diagrams analytically into determinants. 
Unfortunately, Rombouts and collaborators found that 
a prohibitively severe sign problem remained in the pa- 
rameter regimes relevant to strong correlation physics. 
This, and the fact tha t the lattice algorithm given in 



( Rombouts et al . Il999^ was restricted to density-density 
interactions, caused many researchers to abandon the 
approach - except for the special case of sign-problem- 
free models with an attractive interaction, where CT- 
QMC methods have successfully been used to investi- 
gate the BEC-to-BCS crossov er in ultracold atomic gases 
(jBurovski etai\ . I2008L [2006al lbl) . 

The increasing importance of impurity models has mo- 
tivated a reexamination of CT-QMC methods. Impu- 
rity models turn out to have a much less severe sign 
problem than the full lattice problem (indeed in some 
cases the sign problem is absent). The reduction in 
severity of the sign problem has allowed the develop- 
ment of flexible and powerful continuous-time quan- 
tum Monte Carlo im purity solvers, first in a weak- 
coupling forniulation ([Rubtsov and Lichtensteinl . 12004 : 
Rubtsov et al . l2005l) . soon thereafter in a complemen- 



tary hybridization expansion formulation (jWerner et al. 
20061 ) , and more recent ly in an auxiliary field formula- 



tion (|Gu11 et all l2008a|) . These methods have quickly 



been extended in many directions and applied to nu- 
merous dynamical mean field studies of model Hamil- 
tonians. They enabl ed accurate si mulati ons of the 



Kondo lattice model (jOtsuki et all l2009al ). the first 



quantitative studies of multi-orbital models with real- 
i stic rotatio n ally i n yariaiit (noii - diagonal) inter a ctions 



( Chan et aZ.I. l2009t iHaui^. 120071: iRubtsov et all 12005 



Werner et aU 120081: fwerner and Millid . I2006ll2007ct) and 



allowed much more efficient simulations of the multi-site 
clusters needed to study spatial co rrelation effect s within 
dynamical mean f ield th e orv ([Perrero et~. , I2OO94 
GuW et all l2009l. l2008"bl: iHaule and Kotliaii l2007b : 



Mikelsons et all l2009at IPark et all l2008at ISordi et al . 
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20101: IWerner et aZI . l2009al ). They have also en- 
abled more reahstic "LDA+DMFT" studies of materials 



( Marianetti et aZ.1 . 120071) 



Continuous time quantum Monte Carlo methods can 
also be used to efficiently compute four-point corre- 
lation functions, which are important for susceptibil- 
ities, phase boundaries, and in connection with re- 
cent ly developed ex tension s of dynamical mean field the- 
ory (Kusunosel 2006: Rub tsov et al . 2008 : Slezak et al. 



2009: Toschi et al. 



I2007D . The methods have been 
applied to nanoscience topics including the proper- 
ties of t ransition metal clusters on metal surfaces 
(iGorelovllioO? : ISavkin et aLl . [2005h . Previously inacces- 
sible physics questions such as the quasiparticle dynamics 
and thermal cros sovers in heavy fermion materials are 
being addressed (iHaule and Kotliar , 2007a : IPark et al. 



2008bl : IShim et al\ . 120071 ) and applications to questions 
motivated by experimen ts on fermions in optical lattices 
have begun to appear ( Dao et al . 2008t De Leo et al. . 
20081 ). Extensions to nonequilibrium problems are 



now under development (iMiihlbacher arid Rabanil 2008 : 
Schiro and Fabriziol . lioOOl : IWerner et a/.l . I20ldl2009bl ). 

While the new CT-QMC methods have been trans- 
formative, opening wide classes of problems to system- 
atic study, they have not solved the fermion sign prob- 
lem. As far as is known, sign problems are physical and 
unavoidab le, at least in itin e rant p hases with unpaired 
fermions (jTrover and Wiesd . 120051 ) and indeed set the 
ultimate limits on the problems and parameter regimes 
which can be studied by the continuous time methods 
discussed here. Further discussion of sign problems will 
be given in Sec. III.Dl and in the context of the discussion 
of specific algorithms. 



II. DIAGRAMMATIC MONTE CARLO IN CONTINUOUS 
TIME 

A. Basic ideas 

The basic idea of the CT-QMC methods is very simple. 
One begins from a Hamiltonian H = Ha + Hi, which is 
split into two parts labeled by a and 6, writes the parti- 
tion function Z — e^^^ in the interaction representation 
with respect to Ha and expands in powers of iJt,, thus 
(Tr is the time ordering operator) 



Z =Tr Tre^f^"-^ exp 



I, "'0 



dTi . . 



drHbir) 



dTh 



X TT[e-^"^Hb{Tk)Hb{Tk-i)...Hb{n)]. (19) 

The trace evaluates to a number and diagrammatic 
Monte Carlo methods ( Prokof 'ev and Svistunovl . 119981 ) 



enable a sampling over all orders k, all topologies of the 
paths/diagrams and all times ti, • • • , in the same cal- 
culation. Because the method is formulated in continu- 
ous time from the beginning, time discretization errors 
do not have to be controlled and the simulation can be 
arranged to ensure that the method focuses attention on 
the time regions which are most important to the process 
under study. Provided the spectrum of the perturbation 
term is bounded from above the contributions of very 
large orders are exponentially suppressed by the factor ^ 
originating from the expansion of an exponential. Thus 
the sampling process docs not run off to infinite order 
and no truncation of the diagram order is needed. (Note 
that for bosonic operators a perturbation in the inter- 
action would be divergent since the spectrum cannot be 
bounded from above u nless a cutoff in bosonic occu pation 
number is introduced (jitzvkson and Zubeil l2006l) . so an 
expansion in the hybridization is usually employed.) 

The method docs not rely on an auxiliary field decom- 
position although it may b e advantageously combined 
with one ( Gull et al . 2008al ). Further, the method does 
not rely on a particular partitioning into "interacting" 
and "noninteracting" parts; in principle the only require- 
ment is that one may decompose the Hamiltonian in such 
a way that the time evolution associated with Ha and 
the contractions of operators Hb may easily be evalu- 
ated. In practice, the sign associated with interchanges 
of fermion operators means that the expansion must be 
arranged such that terms differing only in the contrac- 
tions of fermion operators are combined; for example into 
determinants. 

In the impurity model context four types of expan- 
sion have been formulated, which we refer to as CT-HYB 
(iJfe = ifhyb, Eq. ®), CT-INT {Hb = Hl^, Eq. ©), CT- 
AUX {Hb = -ffiof, but with an additional auxiliary field 
decomposition) and CT-J (an expansion for Kondo-like 
problems with Hb = H'^'^^'^^" , Eq. ^). The advantage 
of the hybridization expansion is that arbitrarily com- 
plicated impurity interactions can easily be treated; the 
disadvantage is that because [^^hyb, ^^loc] 7^ at least 
one of the operators is non-diagonal so the expansion 
generically requires the manipulation of matrix blocks 
whose size grows exponentially with the number of im- 
purity orbitals. The present state of the art is that 5 
spin-degenerate orbitals can be treated. Various trunca- 
tion and approximation schemes provide limited access to 
larger problems but as the number of orbitals is increased 
the difficulties rapidly become insurmountable. 

CT-INT and CT-AUX are variations of an "interaction 
expansion" . They are sometimes referred to as "weak 
coupling" expansions, but this is a misnomer - the expan- 
sion is in powers of the interaction but is not (in princi- 
ple) restricted to small interactions. The series is always 
convergent for nonzero temperature and finite number 
of orbitals. In CT-INT and CT-AUX the scaling with 
number of impurity orbitals is not exponential, so much 
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larger systems can be treated. However, the methods are 
most suited to Hubbard-hke models with a single local 
density-density interaction. More complicated interac- 
tions typically require multiple expansions in the several 
vertices and if the interactions do not commute (as is 
the case for the components of the spin exchange) the 
difficulties increase. 

CT-J is an expansion organized for Kondo-like mod- 
els where the interaction vertex also creates particle-hole 
pairs in the conduction bands. It combines aspects of 
both the interaction and hybridization expansion. 

While all of the expansions are based on the same gen- 
eral idea, there are significant differences in the specifics 
of how the expansion is arranged, the measurements are 
done and the errors are controlled. We therefore devote 
a separate section to each expansion. In the remainder 
of this section we provide an overview of general aspects 
of continuous time Monte Carlo methods. 



B. Monte Carlo basics: sampling, errors, Markov chains 
and the Metropolis algorithm 

In this subsection we recall some basic results pertain- 
ing to the Monte Carlo evaluation of high dimensional 
integrals. For reader unfamihar wit h M onte Car l o, the 
books by iLandau and Binder] (|2000t ) and iKrauthI (|2006l ) 
give an extensive introduction to the technique. 

In the CT-QMC methods, as in many other classical or 
quantum many-body problems, one is faced with the is- 
sue of evaluating sums over phase spaces or configuration 
spaces which we denote generically by C . C is typically of 
a very high dimension, so Monte Carlo techniques arc the 
only practical methods of evaluation. A crucial quantity 
is the partition function, Z, which wc will write formally 
as an integral over configurations x e C with weight p(x): 



ability p(x)/Z and averaging the contributions ^(x^): 



Z 



dxp(x). 



(20) 



In a classical system x might be a point in phase space 
with a Boltzmann weight p(x) = exp{—/3E{x)), where 
i?(x) is the energy of the configuration x. In the quan- 
tum problems described here x will represent a particular 
term in a diagrammatic partition function expansion. 

The expectation value of a quantity A is given by the 
average, over the configuration space C with weight p, of 
a quantity ^(x): 



1 



(21) 



The auxiliary quantity A{x.) depends on the specific rep- 
resentation chosen in a particular algorithm. 

The average (PT|) can be estimated in a Monte Carlo 
procedure by selecting M configurations x^ with a prob- 



{A)p « {A)mc 



M 



(22) 



According to the central limit theorem, if the number 
of configurations is large enough the estimate (22) will be 
normally distributed around the exact value {A)p with 
variance 

VarA 



[AAY EE {{Amc - ApY) = 



M 



(23) 



It will sometimes be advantageous or necessary to sam- 
ple configurations x^ with a distribution /9(x) different 
from p(x). The expectation value {A)p in the ensemble 
then has to be reweighed: 

(A) = i^dx^(x)p(x) 

_/,dx^(x)g|p(x)_(AH), 



(24) 

/,dxHi^p(x) (-)p 

To estimate this expectation value one needs to sample 
both the numerator and denominator and collect aver- 
ages of A{Ki)p{Ki) / p{Ki) and p{xi)/p{xi). Care must be 
taken in estimating the statistical errors of such ratios, 
since cross-correlations will make naive error propagation 
unreliable. A jackknife or bootstrap procedure (see, e.g. 



(jVetterling et ami992l) ) is needed 



Integrals with general distributions such as Eqs. (pOj) . 
dMl) are best sampled by generating configurations using 
a Markov process. A Markov process is fully character- 
ized by a transition matrix Wxy specifying the probabil- 
ity to go from state x to state y in one step of the Markov 
process. Normalization (conservation of probabilities) re- 
quires W^xy = 1. Starting from an arbitrary distribu- 
tion the Markov process will converge exponentially to a 
stationary distribution p(x) if two conditions are satis- 
fied. 

• Ergodicity: It has to be possible to reach any config- 
uration X from any other configuration y in a finite 
number of Markov steps: for all x and y there ex- 
ists an integer N < oo such that for all n > N the 
probability (M^")xy ^ 0. 

• Balance: Stationarity implies that the distribution 
p(x) fulfills the balance condition 



/ dxp(x)VFxy = J5(y), 
Jc 



(25) 



that is p(x) is a left eigenvector of the transition 
matrix Wxy. A sufficient but not necessary condi- 
tion usually used instead of the balance condition 
is the detailed balance condition 



Wx 



w. 



which we will use below. 



pjy) 

p(x)^ 



(26) 
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The first, and still most widely used, algorithm that 
satisfies detailed balance is the Metropohs-Has tings al- 
gorithm ( Hastings . 19701 : Metropolis et at . 1953 ). There, 
an update from a configuration x to a new configuration 
y is proposed with a probability W^y°P but accepted only 
with probability W^'^. If the proposal is rejected the old 
configuration x is used again. The transition matrix is 



WPropTiAacc 
xy xy 



(27) 



and the detailed balance condition (|26p is satisfied by 
using the Metropolis-Hastings acceptance rate 



Tl/acc 



min [1, i?x 



with the acceptance ratio i?xy given by 

_ p(y)^yTP 



(28) 



(29) 



and i?yx = l/i?xy To simplify the notation we will 
often quote just i?xy, and imply that min[l,i?xy] is the 
actual acceptance probability. Note that the acceptance 
ratio i?xy includes both the weights and the proposal 
probabilities. In the following sections we will always 
specify both the proposal probabilities W^y"^ and the 
acceptance ratios i?xy. 



,3 

n T-1 T2 T3 



FIG. 1 Diagrammatic representation of configurations x — 
{{k;Ti, . . . Tfc)} G C showing examples with orders fc = 0, 1, 2, 3 
and vertices (represented by dots) at times ri, . . . , ra. 



could cause difficulties with proposal and acceptance 
prob abilities in the random walk in configuration space . 



As (IBeard and WieS' 



199 



Prokof ev et all 




■Prokof'ev and Svistunov . 
iggsh showed, this is not 
the case. 

The various algorithms reviewed here differ in the rep- 
resentations, weights, and updates, as well as in the most 
convenient representation for the measurement of observ- 
ables, but all express the partition function in the gen- 
eral form ([50]) . To illustrate the Monte-Carlo sampling of 
such continuous-time partition function expansions and 
in particular to demonstrate that the infinitesimal does 
not cause problems, we consider the very simple partition 
function 



Z 



E/ drj dr,--- 

k=Q "'O "'0 



^ ^ w{k) 



(33) 



C. Diagrammatic Monte Carlo - the sampling of path 
integrals and other diagrammatic expansions 

The partition function Eq. (|19p may be expressed as a 
sum of integrals originating from a diagrammatic expan- 
sion: 



fcW(fc,7,Tl,...,Tfc), 



(30) 

which has the form of Eq. (pUj) . The individual configu- 
rations are of the form 



X = (fc,7, (ti, . . . ,Tfe)), 



(31) 



where k is the expansion or diagram order and 
Ti, . . . ,Tfc € [0,/3) arc the times of the k vertices in the 
configuration. The parameter 7 g includes all discrete 
variables, such as the topology of the diagram and spin, 
orbital, lattice site, and auxiliary spin indices associated 
with the interaction vertices. 
A configuration x has a weight 



p(x) = w(fc, 7, Ti, . . . , Tfc)dri • • • dTk, 



(32) 



which we will assume to be non-negative for now. The 
case of negative weights is discussed in Sec. III.DI Al- 
though these weights arc well-defined probability densi- 
ties they involve infinitesimals dr, which one might worry 



which using time ordering can be rewritten as 



A:=0 



dri I dT2 ■ 

Jti 



/3 



dTkw{k)- (34) 



The distribution describing the probability of a diagram 
of order k with vertices at times {tj} is (here we make 
the times explicit) 



p{{k,Ti,...,Tk)) = w{k)Y[dT,. 



(35) 



In the following we will always assume time-ordering ti < 
T2 < ■ ■ ■ < Tk and visualize the configurations using a 
diagrammatic representation as in Fig. [TJ 

Transitions between configurations x and y are real- 
ized by updates. Updates in diagrammatic Monte Carlo 
codes typically involve (i) updates that increase the order 
k by inserting an additional vertex at a time r and (ii) 
updates that decrease the order k by removing a vertex 
Tj . These insertion and removal updates are necessary to 
satisfy the ergodicity requirement and are often sufficient: 
we can reach any configuration from another one by re- 
moving all the existing vertices and then inserting new 
ones. Additional updates keeping the order k constant 
are typically not required for ergodicity but may speed 
up equilibration and improve the sampling efficiency. In 
some special circumstances, for example if all odd order 
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FIG. 2 An insertion update (top to bottom) inserting a vertex 
at time rs and the corresponding removal update (bottom to 
top), removing the vertex at time ra. 



diagrams have zero weight, updates which insert or re- 
move multiple vertices are required. 

In the following we will focus on the insertion and re- 
moval updates, illustrated in Fig. [2] For the insertion 
let us start from a configuration (fc, r) = (fc, ri, . . . , r^) 
of order k. We propose to insert a new vertex at a 
time T uniformly chosen in the interval [0,/3), to ob- 
tain a new time-ordered configuration (fc -I- 1,t') = {k + 
l,Ti,. . . ,T,. . . ,Tk) = {k + 1,t{, . . . ,r^_^J. The proposal 
rate for this insertion is given by the probability density 

(36) 



The reverse move is the removal of a randomly chosen 
vertex. The probability of removing a particular vertex 
to go back from (fc -|- 1,t') to (fc,T) is just one over the 
number of available vertices: 

^ (37) 



wProp 



fc+ 1 



To obtain the Metropolis acceptance rates we first cal- 
culate the acceptance ratio 



R. 



(fc,T),(fc+l,r') 



w{k + l)dT[ 



p{{k + i,f'))wr'' 



(k+l,T'),(k,T) 



p{{k,T)) 



^^ik,T),{k + l,f') 



(38) 



■dT',^,l/{k + l) _w{k + l) p 



w(k)dTi ■ ■ ■ dTk dr / /3 w{k) k + 1 

Observe that all infinitesimals cancel: the additional in- 
finitesimal in the weight p{{k + 1,t')) is canceled by the 
infinitesimal of the proposal rate for insertions. 

Equation [35] implies that the acceptance rates W^'^'^ 
are well defined finite numbers given by 

I = min [l, R(k,r),{k+i.T')] , (39) 

,(fc,r) = min [l, l/i?(fe,?),(fc+i,r')] • (40) 
Acceptance rates for updates that preserve the order fc, 
such as shifting some of the times or updating the dis- 
crete parameters 7 are straightforward to evaluate since 
there all infinitesimals cancel trivially. 

The general scheme of diagrammatic Monte Carlo al- 
gorithms is illustrated in Fig. [3] One cannot stress of- 
ten enough that measurements are performed again on 
the old configuration if the proposed update has been 
rejected. 



Ti^acc 
Ti/acc 




meciaure 
observables 



compute proposal 
probability of update 
and inverse update, 
compute probability 
ratio of new to 
old configuration 




reject change, leave 
old configuration 



set configuration to 
new configuration 



FIG. 3 Continuous-time Quantum Monte Carlo flow diagram. 



D. The negative sign problem 

Until now we have tacitly assumed that the expan- 
sion coefficients of our partition function expansion are 
always positive or zero. This has allowed us to interpret 
the weights as probability densities on the configuration 
space and the stochastic sampling of these configurations 
in a Monte Carlo simulation. If the weights p(x) become 
negative, as is often the case in fcrmionic simulations due 
to the anti-commutation relations between fcrmionic op- 
erators, they can no longer be regarded as probabilities. 
The common solution is to sample with respect to the ab- 
solute value of the weight p{x) = \p{x))\ and reweight the 
measurements according to Eq. . The ratio p(x) / p(x) 
is then just sign(p(x)) = p(x)/|p(x)|. This gives for the 
average ((2T|) 



(A) 



{A ■ sign)|p| 



(sign) 



(41) 



which can be evaluated by sampling numerator and de- 
nominator separately with respect to the positive weight 

b(x)|. 

While sampling with the absolute value and reweighing 
allows Monte Carlo simulations of systems with negative 
weights, it docs not solve the "sign problem" . Sampling 
Eq. (j4ip suffers from exponentially growing errors. To 
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sec this let us consider the average sign 



(sign; 



J^dx sign(x)|p(x)| _ ^ 
/^dx|p(x)| Z|p| 



(42) 



which is just the ratio of the partion function Z and the 
partition function of a "bosonic" system with positive 
weights |p(x)|. This ratio can be expressed through the 
difference AF in free energies of these two systems 



(sign; 



z 
z' 



cxp(-/3AF), 



(43) 



and decreases exponentially as the temperature is low- 
ered or the volume of the system increased. 

The sign problem is thus the accurate measurement of 
this near-zero sign from individual measurements that 
are -1-1 or —1 , a cancellation problem. The variance of 
the sign is 

Var sign (sign^) - (sign)^ = 1 - exp(-2;3AF) « 1 

(44) 

and the relative error after M measurements 



Asign 



^Var sign/A/ exp(/3Ai^) 



(sign) 



M 



(45) 



grows exponentially with decreasing temperature and in- 
creasing system size. 

The sign problem has been proven to be nonde- 
terminstic polynomial (NP) hard, and hence in gen- 
eral no polynomial time solution is believed to exist 
(jTrover and Wiesd . l2005h . However, the severity of the 
sign problem (in the notation of Eq. ()45p the magni- 
tude of the coefficient exp(/3AF)) depends both on the 
model considered and on the representation chosen for 
the model. Impurity models tend to have less severe 
sign problems than comparable finite-sized lattice mod- 
els ('turning off' the coupling to the bath often makes 
the sign problem worse). In special cases the sign prob- 
lem is absent. For example, Yoo and coworkers proved 
that there is no sign problem in Hirsch-Fye simulations 
of the single impurity s ingle orbital Anderson impurity 
model (|Yoo et all 120051 ) , and this proof can be easily ex- 
tended to some multi-orbital models and adapted to the 
continuous time algorithms presented in this review. 

A trivial sign problem arises if the operator — i/h is 
negative and odd perturbation orders are allowed. A 
simple example is the weak coupling expansion of the 
repulsive (positive U) Hubbard model. In this particular 
case the sign problem may be avoided by a trick discussed 
in Section HlO] 

In the hybridization expansion a severe sign prob- 
lem may occur if the hybridization function and the 
bare level energy do n ot commute [A*^^, £'"''] ^ 
(|Wang and MiUii 120101 ). An apparently related diffi- 
culty occurs in the weak coupling approach if the hy- 
bridization function and interaction are not diagonal in 



the o rbital/spin occupation number basis (iGorelov et ali . 
20091) . In the larger systems dealt with in cluster dynam- 
ical mean field theory, fermion loops occur and produce a 
sign problem. Because the sign problem is model and rep- 
resentation dependent, further discussion is postponed to 
the sections pertaining to specific algorithms. 



III. INTERACTION EXPANSION ALGORITHM CT-INT 

The interaction expansion algorithm CT-INT was the 
first continuous-time impu r ity so lver to be introduced 



(jRubtsov and Lichtensteinl |2004[ ). It proceeds from 



Eq. (|19p . with Hf, taken to be the interaction part H^^^ of 
Eq. (HI), and Ha ^ Hbath + i?hyb + -^loc (sec Eqs. ^ and 
([2|)). It has a better scaling with system size than the 
hybridization algorithm and can treat more general in- 
teractions than CT-AUX. A "trivial" sign problem arises 
for repulsive interactions, where terms of the form {—U)^ 
appear. Elimination of this sign problem is an important 
issue in the design of the algorithm. 



A. Partition function expansion 

Wc illustrate the method by considering the simplest 
model, the one orbital single site Anderson impurity 
model Eq. (jl2p which, for this expansion, is most con- 
veniently formulated in terms of the action S ^ Sa + Su 
with 

^o = -E lfdTdT'dl{T)gl{T-T')-'d,{T'). (46) 



Su = U dTn^{T)ni{T), 



(47) 



where ~ {i^n ~ co ~ A„)^^ , and eo is the impurity en- 
ergy level. We consider more general models in Sec. lIII.D] 
The expansion of the partition function in powers of U 
reads 



Z/Zo = 1 



(48) 

dTi(?i^(ri)n;(Ti))o 

dTi dT2 (nt (n )n4_ (ti )nt^ (t2 ) nj, (t2 )) 



1! J, 
2! 



where the notation (. . .)o = ^ T^[d\d]e^^°[. . .] de- 
notes an average in the non-interacting ensemble with 
quadratic action Sq (see low order terms in Fig. 31), and 
= / T>[d\d]e-^«. Employing Wick's theorem (fwickl . 
19501) ' we may express the expectation value in terms 
of determinants of the non-interacting Green's function 
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dlin) di(Ti) 




FIG. 4 Depiction of a third order term in the weak coupling 
expansion. Upper panel: Hubbard interaction vertices de- 
noted by circles. Each Unf{T)ni{T) - vertex has four opera- 
tors. Lower panel: one possible contraction of the interaction 
vertices. 



{n^{Ti)ni{Ti)n^{T2)ni{T2) ■ ■ ■ n^{Tk)ni{Tk))a = 

detT)ldetT>i, (49) 
(D^),, =CyO(r,-r,). (50) 

Summing the contractions into a determinant instead of 
sampling them individually reduces the size of the con- 
figuration space and avoids a sign problem coming from 
the fcrmionic exchange. 

We thus arrive at the following series for the partition 
function: 



k=0 



k\ 



dTi...dTk rQdctD^ 



(51) 



Two "sign problems" may potentially occur in this ex- 
pansion: an "intrinsic" sign problem arising from fermion 
exchange because the determinants might become nega- 
tive and a "trivial" sign problem, arisi ng for U > from 
the {—U)^ factor. The arguments of IYoo etaU (j2005l ) 
prove that for the single impurity Anderson model each 
of the determinants is no-negative, so there is no intrin- 
sic sign problem. This is not necessarily the case for the 
more general models considered in subsection llll.DI The 
"trivial" sign problem arising for [/ > can be managed 
in several ways . For the single band, single-impurity An- 
derson model (jRubtsovl . 120031 ) showed that the replace- 



ment (i| 

parameters of the effective action: 



d^.di — >■ d\ leads to following changes in the 



£04. 



Cot ^ Sot + 



A^(r) ^ -A^(-r) 
U ^ -U. 



(52) 



The repulsive interaction becomes attractive and the 
"trivial" sign problem due to the interaction term van- 
ishes. 

This approach performs a particlc-holc transformation 
on the down spins only such that up and down spins arc 



treated inequivalently. While the entire series formally 
maintains spin inversion symmetry (in the absence of a 
magnetic field) , restoring it dynamically by Monte Carlo 
sampling is challenging in practice. It is better to avoid 
the symmetry breaking as follows. 

First, observe that the transformed Hamiltonian can 
be viewed in the original variables as an expansion in 
Un^{n\^ ~ 1); this leads to a down-spin determinant with 
diagonal elements replaced by t/°(0) — 1. The absence of 
a sign problem means that the down spin determinant 
must generate a minus sign that compensates the {—U) 
factor. This approach may be generalized: expanding in 
powers of 



Su^U dr {n^{T) - af) {ni{T) - ai) , (53) 
Jo 

with the corresponding change eoa — > eo<T — Ua^a, — > 
G° in S'o, leads to 



detD^ = ^TV[7v(ti) - Oo-] • • • [na{Tk) - "o-])^ 



(54) 



Rubtsovl ( 2003 ) showed that for + = 1, a^a J,< 
the trivial sign problem is absent. 

Finally, it is advantageous to avoid this explicit sym- 
metry breaking at the cost of introducing an auxiliary 
field s =t, J, and expanding in powers of 

Expanding this action wc get an additional random vari- 
able Si =^t\- at each vertex that needs to be sampled 
over. In practice this does not introduce any difficulties: 
all expressions remain the unchanged, apart from an an 
additional index . „ instead of Ur, in the determinants 



of Eq. (|54|. 

In the actual calculation it is useful to take the param- 
eter = 0.5 -|- 5 for s = a and Usa = ^5 otherwise. 
In principle, 5 can be taken to be zero but setting it to 
a small positive value 5 « 0.01 allows to avoid numerical 
instabilities due to nearly-singular matrices. 

An interaction expan sion has also been derived in 
( Assaad and Lang . 20071) for retarded interactions such 
as 

5rct = E dTdT'0''{T)W''\T - t')0\t'), (56) 

where O denotes a fermion bilinear. This formalism will 
be discussed in Sec. IVII.CI 
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B. Updates 

The series (|5ip and the corresponding one for ([55]) are 
of the type pop . and we can employ continuous-time sam- 
pHng as described in Sec. III. CI We insert and remove in- 
teraction vertices on the imaginary time axis, correspond- 
ing to the terms U{nf{T)-as^^){ni{T)-as^l) (see Figure 
[51) . Proposing a vertex insertion update with probabihty 
dr / (2(3) (for the imaginary time location and the orienta- 
tion of the auxiliary spin s^) and a removal update with 
probability l/(fc -f- 1) we obtain 



R 



(3U 



{k 



dctD 



fe+l 



detD? 



(57) 



Note that for the interaction defined in Eq. ((55)) the pref- 
actor i is compensated by the factor 2 in the ratio of 
proposal probabilities, which comes from the two possi- 
ble values of St-, so that the acceptance ratio is the same 
as in the straightforward approach. 

This update and its inverse are sufficient to be ergodic. 
In evaluating the determinant ratios the fast-update tech- 
nique described in Scc. lX.Al should be used, since it allows 
to calculate the ratio R in O(fc^) operations, substan- 
tially faster than the nai've evaluation of determinants 
with 0(fc'^) operations. 



i^R^ 



3i( 






^ — -A A 



FIG. 5 Local updates for the CT-INT algorithm, (a): start- 
ing configuration (b): Insertion of a vertex, (c): removal of a 
vertex, (d): shift of a vertex in imaginary time. 



is estimated by Gr^rx,...,TuTt, {t, t') (corresponding to A{x.) 
in Eq. (HI])): 



G 



Ga{T-T') = (G'riTi,...,rA,Tfc(T,T'))MC, (59) 
{Trdcy{T)d)^{T')ni„n2a ■ ■ ■nkcr)o 



TiTi,...,TfcTfc 



(T,r') = - 



{nian2a ■ ■ ■ nkcrJO 



(60) 



The (• • •)mc denotes a Monte Carlo average, while the 
(• • • )o denotes all possible Wick's contractions of one par- 
ticular Monte Carlo configuration. The denominator is 
a determinant that cancels the Wick's contraction of a 
partition function configuration p, and the numerator 
determinant consists of a matrix with an additional row 
Kir-r'), g"Ar-n),gUr-T2), . . .^G'Ar-rk)] and col- 
umn (5° (r - r'), C/2(n - r'), ^^"(r^ - r'), ^^^(r. - r')]. 

The configuration G'TiTi,...rfcTfc (''■, ''■') in Eq. ((60)) de- 
pends on two independent arguments r, r', while the ob- 
servable average Eq. ((55)) is time-translation invariant. 
This symmetry of the effective action is restored only af- 
ter the averaging in Eq. ((5^)) . It will be shown in Sec. IX. C) 
that it is best either to measure a quantity corresponding 
to SG or to perform a Fourier transform to Matsubara 
frequencies analytically, so that the Green's function is 
calculated directly in the frequency domain. 

There is one particular observable estimate that can 
be obtained just from the properties of the random walk 
itself, without any additional calculation: the average 
value of the perturbation operator. One can see from 
a term-to-term comparison of the respective series that 
the average perturbation order (fc) is proportional to the 
inverse temperature and the average value of the inter- 
action operator. 



(fc)MC = {Su) 



(61) 



Therefore, the expecation value of the interaction opera- 
tor UnfUi is {k)Mc/l3- 



D. Generalization to clusters, multi-orbital problems and 
retarded interactions 



C. Measurements 

Monte Carlo averages are calculated using Eq. pip and 
((22)) . where the distribution p of Eq. ((2T)) is given by 
the coefficients of Eq. ((ST)) . In particular, the Green's 
function 



G.{t-t') = - 




X ^T,d<,(T)4(T')nit(Ti)nu(Ti) • --nkiiTk))^ (58) 



In the case of the Hubbard model on a cluster, the only 
difference to the single orbital case is that creation and 
annihilation operators acquire an additional site index. 
We can absorb all quadratic hopping terms in C/" and 
perform the interaction expansion in 

Su ^U'^{ni\ - ai^){nii- ail), (62) 

i 

where i runs over the sites of the cluster. The tticr-terms 
are chosen as in the single site case; optionally with an 
auxiliary spin Si at each site. 

The Green's functions Gij{Ti — Tj) are site-dependent, 
but the spin up and spin down contributions still factor 
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into separate determinants: 



I ...... E ^?^ndetD 



(63) 



where (D^)^ 



Sl---Sfc=±l 



It follows immedi- 



ately that there is no sign problem in the half- filled case, 
where the determinants of the up- and down matrices are 
identical. However, away from half filling a si gn problem 
occurs in general, see e.g. Fig. 5 in Ref. (jGuU et all 
20083). 

For the updates a generalization of Eq. (|57p should be 



used, where (3 is replaced by the factor l3Nc, with Nc the 
number of sites in the cluster. 

In the general case of multiple orbital problems intrin- 
sic sign problems typically occur, and management even 
of the trivial sign problem becomes more involved. The 
basic idea is to express the interaction H^^^ [Eq. ^] in 
action form as 

5ioc = E// drdT'lP^-'idlds - apsKdldr - a,.) (64) 

pqrs 



IV. CONTINUOUS-TIME AUXILIARY FIELD 
ALGORITHM CT-AUX 

A first continuous-time auxiliary field method 
for fermionic lattice models was developed by 
Rombouts et ali ([1998, Il999l) . and applied to the 
nuclear Hamiltonian and small Hubbard lattices. We 



present here a different formulation (jGull et all . l2008al ) 



that is also applicable to (cluster) impurity problems. 
This continuous-time auxiliary field (CT-AUX) algo- 
rithm is based on an interaction expansion combined 
with an auxiliary field decomposition of the interaction 
vertices. One may view CT-AUX as an "optimal" 
Hirsch-Fye algorithm, on a non-uniform time grid and 
with a varying number of "time slices" that are chosen 
automatically for given parameters. The approach allows 
the combination of numerical techniques developed for 
the Hirsch-Fye algorithm (see e.g. Sec. IX.B.1|) with the 
advantages of a continuous-time method. It was shown 
to be equivalent to the weak coupling algorithm in the 



case of the single-band Hubbard model (jMikelsons et al. 
|2009b l). Currently the CT-AUX impurity solver is the 
method of choice for large cluster simulations. 



and then perform a multiple expansion in the interac- 
tions . In multi-oribtal systems the number of terms 
proliferates; for N orbitals there are of order terms, 
although in practice some of them vanish by symmetry. 
Denoting the tuple (pqrs) at vertex i by we have 



4: = J2J2 I dn...dT, 



fc=oei...Cfc' 

= (rfL^s - ap,){dl dr^ -aqr). 



(«6 •••««Jo, 
(65) 



If we insert random (nonzero) matrix elements at random 
times in [0, /?), the prefactor of the acceptance probability 
ratios in Eq. ([57]) . /3U /{k + 1), is modified by a factor N^, 
becoming l3IN^/{k + 1). 

Wick's theorem of Eq. yields a determinant similar 
to If the Green's function matrix for the dif- 

ferent orbitals is diagonal in the orbital indices, the de- 
terminant factorizes into smaller-size determinants, each 
However, in general there is no reason for the determi- 
nant of D to have the same sign for all configurations. 
The choice of a-terms has an influence on the sign statis- 
tics, and they need to be adjusted for each problem such 
that the expansion is sign - free or at least has an av- 
erage sign that is as large as possible. How this is best 
done is still an open question . An ansatz has been pre- 
sented in Ref. (jGorelovL 120071 ). The basic principle is to 
treat the off-diagonal interaction terms with small but 
non-zero a, whereas the symmetrized form (|55p is used 
for the density-density part. 



A. Partition function expansion 

We present the derivation for the case of a cluster im- 
purity problem with Nc cluster sites. The generaliza- 
tion to multiorbital models with density density interac- 
tions is straightforward. The application to more general 
multiorbita l models would involve techniques similar to 
Sakai et MI (l2006allbl ) and has not yet been attempted. 
Starting from the partition function Z = Tre~^'^''+^^' 
we add a non-zero constant K to Hij: 



Ho = Faim -Hu+ K/P, 



(66) 
(67) 



such that 



Z = Ti 



(68) 



Expanding the exponential in po wers of Hj; and applyini 
the auxiliary field decomposition ( Rombouts et a^.l . ll99 



/3U 



n-i-f + rill 



2A^c . 



(69) 

cosh(7) = l + ^, (70) 
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FIG. 



Pictorial representation of configurations 



{k;{sj,Tj)} G C wliich are sampled by the CT-AUX 
algorithm. Diagrams for orders 0, 1, 2, and 3. In this 
algorithm, an auxiliary spin Sj (represented here by the red 
and blue vertices and the direction of the arrows) needs to 
be sampled in addition to the imaginary time location r, of 
a vertex. 



we obtain 

oo 



fc=0 si. - sfc 

=±1 



dri... [ dTk ^ ^ 

JTk-l 



\2l3Nc 



Zk{{si,Ti,Xi}) = Tr]^ 



g-AnffogSi7(niiT- 



Zki{sk,Tk,Xk}), 



(72) 



with Avi = Ti+i — Ti for i < k and Ar^ = /3 — + ti. 
Equation (I7 2p is very similar to th e equations 



for the BSS ( Blankenbecler et all 11981 ) or Hirsch- 



Fve (iHirsch and Fyel 1986 ) algorithms (see also 
( Georges et all 1996[ ). appendix Bl). Using the identity 

1 + e~''^e~^e~^) and following the derivation in 



Gull et am2008al) . we obtain 



^fe({^ 



.}) 



n detN-i({s„T„xJ), 



dia: 



iag(f 



,7(-l)"si 



, ,e 



(73) 

(74) 
(75) 



with the notations (—1)^ = 1, (—1)"^ = —1 and 
(G^:-^"'^).,. = Gl.,..ir, - rj) for ^ ^ J, (G^^-^'^),,, = 
t/°.2..^(0+) (we assume in this section that Gx,x,.cri^~^) > 
0) . As we handle a variable number of time slices at con- 
stantly shifting imaginary time locations, it is advanta- 
geous to formulate the algorithm in terms of a matrix No-, 
defined by = N^Go^ instead of G. With Eq. ^ we 
express the weight of any (auxiliary spin, time, site) - 
configuration in terms of the bath Green's function Q^, 
the constant 7 defined in Eq. (jBH]), and the determinant of 
two matrices N^. The contribution of such a configura- 
tion to the whole partition function is given by Eq. (1731) . 

B. Updates 

In the CT-AUX-algorithm the partition function 
Eq. (j7ip consists of a sum over all expansion orders k 




i 

T2 



FIG. 7 An insertion update and its corresponding removal 
update within the CT-AUX algorithm. 



up to infinity, another discrete sum over auxiliary fields 
s and sites x, and a fc-dimensional time-ordered integral 
from zero to /3, so we can employ the sampling scheme of 
chapter III. CI 

In addition to the imaginary time locations of the in- 
teraction vertices we also need to sample auxiliary spins 
(71) -^j associated with each vertex. Thus, the configuration 
space C (Eq. EQ)) is given by the set 

C ^{{},{{siiTl,Xl)},{{si,Ti,Xi),{s2,T2,X2)}, (76) 

••• ,{(si,Ti,a;i),--- ,{sk,Tk,Xk)},- ■ ■}, 

where the Sj are auxiliary Ising spins that take values 
±1, k is the expansion order, Xj denotes cluster sites and 
the Tj are continuous variables between and /3, which 
we assume to be time-ordered, i.e. ti < T2 < ■ ■ ■ < Tk- 

Note that this representation is d ifferent from the one 
proposed in (jRombouts et al. . Il999ll . where the configu- 
ration space consists of a number A^max of fixed "slots" 
at which interaction operators can be inserted into an 
operator chain (a "fixed length" representation). This 
leads to additional combinatorial factors in the accep- 
tance probabilities. 

Although they are not sufficient for an ergodic sam- 
pling, we first consider spinflip updates at constant order 
which are fast to compute and very useful for reducing 
autocorrelation times: 



{{si,Ti,Xi) 
{{si,Ti,Xi 



, {Sj, Tj, Xj), ■ 



{sk,Tk,Xk)) 



Xj), ■■■ , {sk,Tk,Xk)), (77) 



the probability density ratios of the two configurations 
are computed from Eq. ([73|) as: 



R = 



(78) 



Vertex insertion updates from configuration x = 
{si,Ti,Xi} to configuration y = {s^, t/, a;^}, on the other 
hand, have to be balanced by a removal updates (Fig.[7|). 
The proposal probability accounts for choosing a random 
time between and /3, a random site, and a random spin 
direction: 



1 dr 

wprop _ ^ 



(79) 
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The proposal probability of removing a spin, going from 
order fc + 1 to order k consists of choosing one of fc + 1 
spins: 

1 



k + 1' 



Therefore we obtain, following Eq. 

K detN^(y)dctN^(y) 



(80) 



(81) 



k+l det Nt (x) det (x) ■ 

The efficient numerical computation of these expressions 
is discussed in Sec. IX. A l and IX. Bl 



C. Measurements 

1. Measurement of the Green's function 



2. Role of the parameter K - potential energy 

Similar to the weak-coupling expansion parameter a 
of Sec. mil the parameter K of Eq. (|66| can be freely ad- 
justed. The average perturbation order (fcctaux) is related 
to K, the potential energy and filling by 

(^ctaux)MC = K - I3U {n-fTii - {n-^ + n^)/2), (86) 



and hence the perturbation order in the continuous-time 
auxiliary-field method grows linearly with K. 



The main observable of interest is the Green's function 
Gpq^cr{T,T') for cluster sites p and q and spin a. First, 
let us note that we are free to add two additional "non- 
interacting" spins s = s' = to Eq. ([7^ at any arbi- 
trary time T and r' (we denote the corresponding matri- 
ces of size n + 2 with a tilde). ZGpq^a-iT, t') is then given 
by an expression similar to Eq. (j72p . with an insertion 
of dair) and (i^(T') at the corresponding times. Using 
the s ame linear algebra a s in Hirsch-Fye (Eq. (118) of 
Ref. (jGeorges et a^.l . ll996^ 1 we obtain 



( K 



fe>0 



2/3 A^c 



E 

l<i<k 



dri 



P 



Si.ri,Xi], 
pq,t7 V ' 7 ' / 



with Gpg^cT"'^''^ 

s' = 0, a block calculation yields 

k 



(82) 
Since s = 

(83) 



1.771 — 1 



Mi,n = [(e^" - l)N^({si,Ti,a;J)];™, , 

(84) 

and Gpg,a{T,T') = {Gpg.a{T,T')}MC- As in the CT-INT 
algorithm we may Fourier transform the above expression 
to obtain a measurement formula in frequency space: 

Gpq{iuJn) = Gpg^iuJn) 



(85) 



Im 



By accumulating the Fourier coefficients directly, we 
avoid many of the discretization and related high fre- 
quency expansion problems (Sec. IX. C]) . 

A closer analysis of Eq. shows that it is possible 
and advantageous to measure S = MC7° = SG directly, 
as will be discussed in Section IX.C. II 



V. HYBRIDIZATION EXPANSION SOLVERS CT-HYB 
A. The hybridization expansion representation 

A complementary approach to the CT-INT and CT- 
AUX solvers described in chapters IIIII and IIVI is the hy- 
bridization expansion algorithm (CT-HYB ) developed by 



Wern e r, Millis, Troyer, an d colla borators (jWerner et al. 



20061: IWerner and Millisl . |2006[) . It proceeds from 



Eq. ([T^ with Hf, taken to be the hybridization term iJhyb 
and Ha = -ffbath + Hioc- An advantage of this approach 
is that the average expansion order for a typical problem 
near the Mott transition is much smaller than in the in- 
teraction expansion methods and there fore lower temper- 
atures are accessible ( Gull et all 120071 ) . General interac- 



tions can easily be treated as long as th e local Hilbert 



space is not too large. The first paper, (jWerner et 



20061 ) ■ presented an algorithm and applications for the 
single impurity Anderson model. A generalization to 
multi-orbital models with co mplex interactions a nd th e 



Kondo model soon followed ( Werner and Millis . 2006f) 



and this formalism was later extended by Hauld (2007) 
who introduced the ideas of basis truncation and sec- 
tor statistics, and implemented the algorithm for models 
with off-diagonal hybridization functions. 

Since i/hyb = E„ (V^cld.+Vfdlcp) = H,,yy, + H^^^ 
contains two terms which create and annihilate electrons 
on the impurity, respectively, only even powers of the ex- 
pansion and contributions with equal numbers of iJhyb 
and ^hyj, can yield a non-zero trace. The partition func- 
tion therefore becomes 



Z = Y, dn... drj dT[... drl (87) 

X Tr [Tre~^"-H^MHl^{T'^) . . . H^MhI^{t[) 
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Inserting the -ffhyb and -ffjy^, operators explicitly yields 

^.f} ^13 „/3 

Z^J2 dn... dn dT[... dr'^ 



X Tr 



T^e-^"^d,,{r,)cl{n)c,Jr',)d], 



•••d..(n)4,(Ti)cpiK)4(rO . 

Separating the bath and impurity operators we obtain 

rP rfi rP rP 



Z = ^ / dn... dTk dT[... drl 



(89) 



01, ■ ik Pi-'-'Pk 



X Trrf [r.e-^^-d,, {Tk)d], (r^) • • • d,, {n)d], {t[) 

X Tr, [T.e-^^--ct^ (r,)c,^, (r^) • • • ^ ^ (ri)c,i (r{ )] . 

We can now integrate out the bath operators Cp(t), since 
they are non-interacting and the time-evolution (given 
by Ha) no longer couples the impurity and the bath. 
Defining the bath partition function 



^bath 



Tre 



(90) 



a p 



and the anti-periodic hybridization function A 

(Eq. m), 



'5 + 1 



we obtain the determinant 
1 



-e 



< T < /3 
-/3<T<0 ' 

(91) 



-Tr, 



ath 



Pi, Pi,--X, 

x4.(^fc)cp.'W)-- -41(^1)^1 W)] -detA, (92) 

for an arbitrary product of bath operators. Here, A is 
& k X k matrix with elements Aim = A.j^j^{Ti — Tm). In 
practice, and in analogy to the algorithms in previous 
sections, it will be more convenient to handle the inverse 
of this matrix A, which we denote by M = A~^ (see 

Sec. Eg. 

The partition function expansion for the hybridization 
algorithm now reads (for time-ordered configurations) 



Z = ^bath ^ jjj dTi-'-dr',^ ^ ^ 

jl,- ■■ 3k ][,■■■ j'^ 

r, [T.e-'^^-d,, (rfe)^, (r^) • • • d,, [n)d], {t[) 



i if i "^d' 

W'* ^ 



FIG. 8 Segment representation of term in hybridization ex- 
pansion of single orbital Anderson model. Upper line: spin 
up orbital, lower line, spin down orbital: heavy line, orbital 
occupied; light line, orbital empty. For each orbital, length 
of black line (occupied orbitals) determines the chemical po- 
tential contribution to the weight factor (|95[) . Shaded areas: 
regions where both up and down orbitals are filled, so the 
impurity is doubly occupied. The length of the shaded area 
enters into an overall weighting factor for the potential energy 
(Hubbard U). 



If the coupling to the bath is diagonal in the "flavor" 
(spin, site, orbital, ...) indices j, then A is a block- 
diagonal matrix and Eq. (|93p simplifies to 



Z = Z\ 



bath 



HE 

j kj=0 



drf 



dr'i. 



X Tr, 



[lVe-^«- (T^^d] (r^ ) . . . (rf )d] (r{^ ) 



(94) 



dot A,- 



B. Density - density interactions 

We first consider (multi-orbital) models with density- 
density interactions. In this case, the local Hamiltonian 
H\oc commutes with the occupation number operator of 
each orbital. We may therefore represent the time evolu- 
tion of the impurity by collections of "segments" which 
represent time intervals in which an electron of a given 
flavor resides on the impurity. An example of such a seg- 
ment configuration for a single orbital model (two spin 
flavors) is shown in Fig. [5] 

Since the local Hamiltonian is diagonal in the occu- 
pation number basis the contribution of the trace factor 
can be computed for each segment conflguration. For a 
model with n orbitals and a total length Lj of segments 
in orbital j and a total overlap Oij between segments of 
flavor i and j one obtains (s is a sign depending on the 
operator sequence) 



wioc(x) = Trd[. . .] 



(95) 



except in the trivial case where there are no operators for 
certain flavors. In the latter case, several segment config- 
urations, involving "full" and "empty" lines, contribute 
to the trace. 



X Tr, 



(93) C. Formulation for general interactions 

If Hioc is not diagonal in the occupation number ba- 
detA. gjg defined by the d]^, a separation of flavors, as in 
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Hlcc 



ODO ♦ 



20081 : IShim et ali 120071 ). The Krylov method described 
in the next section avoids truncations to a large extent. 



FIG. 9 A typical term in the expansion (|93|l : three "flavors" 
(red, blue, yellow) of fermionic creation and annihilation op- 
erators (denoted by filled and empty triangles, squares, and 
circles) are placed at times between and /3. In the general 
case, orbital occupation is not conserved by local Hamiltonian 
so two operators of the same type may follow each other. 



the segment formalism, is no longer possible (see il- 
lustration in Fig. [9]) and the calculation of zi;ioc(x) = 
Tr4T.e-'3^- rfaK"j4(r^ ) . . . d.(ri")4(rf )] be- 
comes more involved. One strategy, proposed in 
(jWerner and Millisl . l2006l ) IS to represent the operators 
da and (1]^ as matrices in the eigenbasis of iJioc because in 
this representation the time evolution operators e~^^°'='^ 
become diagonal. The evaluation of the trace factor thus 
involves the multiplication of matrices whose size is equal 
to the size of the Hilbert space of iJioc- Since the dimen- 
sion of the Hilbert space grows exponentially with the 
number of flavors, the calculation of the trace factor be- 
comes the computational bottleneck of the simulation, 
and the matrix formalism is therefore restricted to a rel- 
atively small number of flavors (< 10). The technical 
part of evaluating these traces is described in detail in 
SecEB 



(j2007l ) observed that conserved quantum num- 



bers may be exploited to facilitate the calculation of the 
trace. If the eigenstates of Hioc are ordered according to 
conserved quantum numbers, the evaluation of the trace 
is reduced to block matrix multiplications (see Sec. IX. F[) 
of the form 



wioc(x) = ^ Tr„ 

contr.m 



(96) 



■■■(O) n ,('p-(^'-^)^l°'=l ,(0\ , fp-^-f^locN 

V^/m",m'V^ Jm' \^ jTn' ,m\^ Jr. 



where O is either a creation or annihilation operator, m 
denotes the index of the matrix block, and the sum runs 
over those sectors which are compatible with the operator 
sequence. With this technique, 3-orbi tal models or 4-site 
clusters can be simulated efficiently (IChan et al\. 200£ ; 
GuU aLl.l2008bl:lHaule and Kotli"aill2007bHPark et al 



2008b : Werner et al\ . 2008() . However, since the matrix 



blocks are dense and the largest blocks still grow expo- 
nentially with system size, the simulation of 5-orbital 
models becomes already quite expensive and the simu- 
lation of 7-orbital models with 5, 6 or 7 electrons is only 
feasible with current computer resources if the simulation 
is restricted to a few valence states and, within this sub- 
space, the maximum size of the blocks is truncated (see 
IX.F.2[) . Simulations based on such a truncated version 



D. Krylov implementation 



An alternative strategy to evaluate the trace in 

20091) 



Eq. ((93|) was proposed in (jLauchli and Werner 



based on the observation that in the occupation num- 
ber basis both the dj-^^'-operator matrices and H\oc are 
typically very sparse, so the c?^^''-operators can easily be 
applied to any given state while efficient Krylov-space 
methods can be used to evaluate the imaginary time evo- 
lution. This implementation involves only matrix-vector 
multiplications with sparse operators S-^'^ and i?ioc, and 
is thus doable even for systems for which the multiplica- 
tion of dense matrix blocks becomes prohibitively expen- 
sive. Furthermore, no explicit truncation of states of the 
local Hamiltonian is required, so that all excited states 
remain accessible at intermediate times t in the trace. 
The outer trace may be approximated by a sum over the 
lowest energy states. If this is done it is important to 
measure the various local observables at t = (5/2 in or- 
der to be least affected by the truncation of the trace at 
T = (and equivalently at t = /?). 

The complexity of the Krylov algorithm is 0(iVdim x 
^tr X ^hyb X Mtoi )j whcrc A^dim IS the size of the impu- 
rity Hilbert space, iVtr the number of states kept in the 
outer trace, iVhyb the number of hybridization events, 
and Mtcr the number of Krylov iterations used for the 
calculation of the time evolution from on e oper ator to 
the next. In Ref. (jHochbruck and Lubichl . 119971 ) it has 
been shown rigorously that these Krylov space algorithms 
converge rapidly as a function of Niter , typically reaching 
convergence for very small iteration numbers p ^ A^dim, 
although the number of iterations depends on the time 
interval r. In the worst case where all states in the trace 
are retained (A^tr = -A^dim) and the complexity scales as 
^dim' where as in the best case N^r = 0(1) and the com- 
plexity is linear in the dimension of the Hilbert space. 
In comparison, the complexit y of the approach described 



in Sec. IV.Cl is cubic in A'dim- Lauchli and Werner ( 20091 ) 



of the matrix formalism were used in (jMarianetti et al\ . 



showed that the Krylov approach with outer trace trun- 
cated to the lowest energy states becomes favorable for 
models with more than 4 orbitals (or 4 sites). The sys- 
tematic error resulting from the truncation of the outer 
trace becomes negligible at temperatures below a few 
percent of the bandwidth. The Krylov-based hybridiza- 
tion expansion thus provides a method for the systematic 
investigation of larger problems such as the dynamical 
mean field theory of transition metal and actinide com- 
pounds. 
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FIG. 10 An insertion update and its corresponding removal 
update within the hybridization algorithm. 



E. Updates 




/3 



FIG. 11 Updates of the hybridization algorithm as described 
in the text: (a) original configuration; (b) removal of a seg- 
ment; (c) shift of an end point of a segment; (d) insertion of an 
antisegment; (e) removal of an antisegment; (f ) removal of an- 
other antisegment such that the remaining segment "wraps" 
around /3. 



In order to sample Eq. (|93l) we perform a Monte Carlo 
simulation as described in Sec. III. Cl We explain the sam- 
pling procedure for the formulation with density-density 
interactions. The two basic updates required for ergod- 
icity are the insertion and the removal of a segment. 

Starting from a configuration of segments = 
{iT!,T^),{Ti^T2)r-- A'^k^^k)} we attempt to insert a 
new segment Sk+i starting at r" to obtain a configu- 
ration Yk+i- This move is rejected if lies on one of 
the existing segments, since we cannot create two iden- 
tical fermions at the same site. Otherwise, we choose a 
random time uniformly in the interval [r'' , r'' ) of length 
^max (Fig. [To)) , where is the start of the next segment 
in Xfc. For the reverse move, the proposal probability is 
given by the probability of selecting that given segment 
for removal. 

Therefore the proposal probabilities are 



WP™P = 

M'rnax 

Typrop ^ 

fc + 1 ' 



and the acceptance ratio becomes 

PyWyP™P _ /3;n.ax^'ioc(y)detA(y) 
k + 1 wioc(x) det A(x) 



R 



-xy 



-prop 



(97) 
(98) 

(99) 



An important second update, equivalent to the inser- 
tion of a segment, is the insertion of an "antisegment": 
the insertion of a annihilator-creator pair istead of a 
creator-annihilator pair. The formulae for the acceptance 
ratio are the same as Eq. (|M|) . Besides smaller auto- 
correlation times these updates cause the two zero-order 
contributions "full occupation" and "no segment" to be 
treated on equal footing. 

Further updates, like the shift of a segment or the shift 
of one or both end points do not change the order of the 
expansion, but help to reduce autocorrelation times. The 
shift moves are "self - balancing" (proposal probabilities 



for shift moves and their inverse are the same), so 



wioc(y)det A(y) 
lyioc(x) dot A(x) ' 



(100) 



Global updates (Sec. IX. G|) . e.g. the exchange of all 
segments of two orbitals, may be required to ensure er- 
godicity, i. e. that the random walk does not get trapped 
in one part of phase space. Such updates require the 
configuration to be recomputed from scratch, and are in 
general of order O(fc^). They are e ssential in calcula- 
tions of magnet i c phase boundarie s ( Chan et al . 20091: 



Potervaev etai\ . 120081: iKunes et all |2009| ). 



F. Measurements 

The CT-HYB algorithm generates configurations with 
the weight that they contribute to the partition function 
Z. To obtain expectation values of an observable we can 
either simulate the series of that observable (which, for 
the Green's function, corresponds to the "worm" algo- 
rithm described in Sec. IX. D|) . or estimate the observable 
according to Eq. (PT|) . 

The single most important observable for quan- 
tum Monte Carlo impurity solvers is the finite tem- 
perature imaginary time Green's function G;„i(t) = 
— {Trdi{T)dl^{0)) . The series for this observable is 



Gim(r;, r^) = -Zbath ^ / rfri...(iTfcdctAfcTrfl 
di{Ti)dliTm,)d,, {Tk)d], {tD . . . d,, (ri)dt (^r[) . 



(101) 



This shows that Green's function configurations at ex- 
pansion order k are partition function configurations at 
expansion order k with additional di and d\y^ operators 
or, alternatively, partition function operators at order 
A: + 1 with no hybridization line connecting to di{Ti) 
and c?J„(Tm). In practice we obtain an estimator of 
G imiTijTm) by identifying two operators di{Ti),dl^{Tfn) 
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Hlcc 



commute with the local Hamiltonian) : 



FIG. 12 Hybridization algorithm: Green's function configu- 
ration. A typical configuration for a Green's function, created 
by taking the partition function configuration of order fc = 3 
and identifying the creation operator at rl and the annihila- 
tion operator at rf as the Green's function operators to obtain 
a Green's function configuration corresponding to a partition 
function configuration at one order lower. Red: creation and 
blue: annihilation operators of the partition function. Light 
purple: Green's function operators. 



in a partition function configuration that are an imagi- 
nary time distance t ~ ti — t^h apart, and removing the 
hybridization line connecting them (see Fig. [T^ . The 
insertion of local operators into a partition function con- 
figuration, as it is done in the interaction expansion for- 
malism, is not ergodic in the hybridization expansion. 

The size (fc— 1) x (fc— 1) hybridization matrix A^'_;^'" of 
all hybridization operators except for di{Ti) and d'l^{Tm) 
corresponds to A with the column/row si and s,„ corre- 
sponding to the operators di and dl^^ removed, and the 
weight of a Green's function configuration Gimin, T,n) is 



PG,.. 



Z 



det A 



(102) 



An expansion by minors or the inverse matrix formulas of 
Sec. IX. Al describe how such a determinant ratio is com- 
puted: 



(A 



'SmSi 



Ms 



(103) 



We can bin this estimate into fine bins to obtain the 
Green's function estimator 



\ *J ' MC 



5(r,r') = 



(5(t-t'), t'>0 
-S{t-t'-P), t'<0, 



(105) 



with t(i) denoting the orbital index of the operator at 
row / column i. For a configuration at expansion order fc 
we obtain a total of fc^ estimates for the Green's function 
- or one for every creation-annihilation operator pair or 
every single element of the (fc x fc)-matrix M = A^^ . The 
measurement in Eq. (jlOSp may suffer from bad statistics 
if very few hybridization lines are present (fc is small) in 
an orbital. In this case, the Green function measurement 
should be based on the insertion of operators. 

In the segment representation, very efficient estima- 
tors exist for the density, the double occupancy and the 
potential energy (and similarly for all observables that 



i>j 

Dij = (ninj)MC- 



(106) 
(107) 



The occupation nj of the jth flavor is estimated by the 
length Lj (Eq. [55]) of all the segments: rij = {Lj/f3). 
Double occupancies and interaction energies are obtained 
from the overlap Oij of segments as Dij = {Oij)/j3. 
The system has a total magnetization of Sz = ((-^f°* ^ 
L|°*))//3. Overlaps and lengths of segments are com- 
puted at every Monte Carlo step, and thus these observ- 
ables may be obtained with high accuracy at essentially 
no additional cost. 

Finally the average expansion or der of the algo rithm is 
an estimator for the kinetic energy (jHauld . 120071 ) ■ similar 
to E^ot in the case of the CT-INT and CT-AUX algo- 
rithms: 



kin 



(fc-) 



MC- 



(108) 



VI. INFINITE-C/ METHOD CT-J 
A. Overview 

In many cases the physics of interest is captured by low 
energy effective theories in which some (often high en- 
ergy) degrees of freedom have been integrated out, leav- 
ing a model described by a restricted Hilbert space. A 
standard example is the "Schrieffer-Wolf" transformation 
which obtains the Kondo Hamiltonian (describing a sin- 
gle = 1/2 spin exchange-coupled to a conduction band) 
as the low energy theory of the single-impurity Ander- 
son model in the regime where the charge fluctuations 
are suppressed and the impurity is occupied by only one 
electron. 

The projection is conceptually advantageous, because 
it allows one to focus on the important degrees of free- 
dom. There is also a computational advantage: while 
the CT-HYB method accomplishes the projection (be- 
cause the simulation produces a large weight for the rel- 
evant states and a small weight for the states which are 
projected out), transitions between the states in the low 
energy manifold require excursions to states with very 
small weight, leading to large auto-correlation times. The 
direct study of a projected Hamiltonian avoids this prob- 
lem. 

Otsuki and collaborators have presented a CT- 
QMG method for d ealing with projected Hamiltonians 



(jOtsuki et am2007l ). Their papers focus on a particular 



class of "Coqblin-Schrieffer" (CS) or generalized Kondo 
models arising in the context of the physics of heavy 
fcrmion compounds. We follow their presentation here.be 
applicable to a much wider range of downfolded models. 
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Coqblin-Schrieffer models arise when an impurity 
spends most of its time in a state of definite charge, 
with occasional virtual fluctuations into different charge 
states. An example is the Anderson model, Eq. ([T^. in 
the large U, weak V limit where, if the level energy is 
correctly tuned, at almost all times the impurity is occu- 
pied by one electron which may be of spin up or down. 
Fluctuations into a state with density n = or 71 = 2 
followed by a return to a state n ~ 1 allow the impu- 
rity to exchange spin with the bath. More generally, the 
dominant charge state will have an A'^-fold degeneracy 
including spin and orbital degrees of freedom and virtual 
transitions will lead to a variety of exchange processes 
which may be e ncoded in a Hamiltonian of th e Coqblin- 
Schrieffer form ( Coqblin and Schrieffei . 1969t ) 



H, 



cs 



bath 



Hi 



(109) 



with impurity states labeled by a spin- or orbital quan- 
tum number a and a bath described by an energy quan- 
tum number k and a spin/orbital quantum number b: 



-ffbath = y^^Skcj^Ckb, (110) 
kb 

Jospin = EgXacc, (HI) 

a 

^J = - E j'T'x..c.'Ckbcl,,. (112) 



Here Xaa' = |q^)(q^'I ^-nd without loss of generality wc 
have chosen a basis in which the impurity (spin) Hamil- 
tonian is diagonal. The exchange parameters J are typ- 
ically of order V'^/U and in most treatments the k- 
dcpendence is neglected. Furthermore, in the applica- 
tions presented to date the spin-orbit quantum numbers 
of the bath electrons are those of the impurity states and 
are conserved so that Hj 



H^^ where 



(113) 



and Ca = 1/v A^X^fcCfca- The consequences of removing 
this approximation are an important open question. 

The formalism of Otsuki et al. follows Eq. ([T9|) with 
Hj playing the role of the expansion term Hi,. While for- 
mally this is a pcrturbativc expansion in an interaction 
parameter, it is in a practical sense closely related to the 
hybridization expansion: each vertex changes the state 
of the impurity, just as does the V-term in CT-HYB; the 
difference is that at each event, one electron and one hole 
is created. In what follows we summarize the main fea- 
tu res of the CT - J algo rithm, following the presentation 
bv lOtsuki et~d\ (|2007t ). 



B. Partition function expansion 

The partition function Z divided by the conduction 
electron contribution Zbath = TrcC^'^^'^"'' is 



Z 



Z 



00 „^ „i3 



bath ^ JO -^Tfc-i 



X Tr, 



spin 



[T,e-^^-"X„,„. in) ■ ■ ■ (Tfc) . 

(114) 



Here, the conduction electron operators are grouped by 
component index a (a resultant sign in the permutation 
is represented by s), k^ is the number of operators c^Cq, 
for each component a, and ~ ^- We also used the 

notation {■■■)c^ ^bathTrcb"^^'"-"'' . . .]. Wick's theorem 
for the conduction electrons implies 



Z 



k=0 
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E / 



/3 



= {-^fJa^c.[■■■Ja,a', ' S ^^^^ 



dTfc E ■ • • E (115) 



X Tr« 



r,e-'3^-'"X,,„; (n) • • • (Tfc) 



(116) 



The ka X ka matrix D^'^"'' is defined by (dJ^"')^ = 
gO(Tf - rj) with ^^(t) = -(T,c„(r)cl(0))c and Wk is 
the weight of a Monte Carlo configuration of order k. 
This configuration can be represented in terms of the 
numbers {rj = (ti, • • • , Tk) and {aj = (ai, • • • , ak), 
or, as illustrated in Fig. [131 by a decomposition of the 
imaginary time interval into k segments [Ti,Ti^i) (mod- 
ulo periodic boundary condition) with flavor a^. These 
variables define the sequence of X-operators 

(rfc)---X„.,._,(r,)---X„,,,(ri), (117) 

and a corresponding sequence of c-operators: 

(-l)''+'4,(n)ca,(rfe) . . .4(r,+i)c„.(r,) . . . 

...4(T2)c.,(ri). (118) 



An expression equivalent to Eq. 11161 was presented in 
the lan dmark 1970 Anderson - Yuval study of the Kondo 
model ( Yuval and Anderson . 1970l ). but at that time 
could not be used as a starting point for numerical cal- 
culations. 



C. Updates 

Updates which change the order k are required for er- 
godicity, and updates which shift one of the operators in- 
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FIG. 13 Illustration of an insertion of a segment. The dia- 
grams represent the configurations of {r^} and {q^}. 



crease sampling efficiency. In this section, we discuss up- 
dates which change the perturbation order by ±1. Note 
that if some coupling constants are 0, the straightforward 
sampling may not be ergodic. For example, when the in- 
teraction lacks diagonal elements in the iV = 2 model, 
the perturbation order must b e chaii ged by ±2. We re- 
fer the reader to (jOtsuki tt al\ . 120071 ) for a discussion of 
updates which insert or remove several operators. 

Let us consider the process of adding r and a, which 
are randomly chosen in the range [0,/?) and from the 
A'^ components, respectively. If t satisfies t„+i > r > 
T„, {t,} and {aj change into (n, • • • , r„, r, r„+i, • • • , Tfc) 
and (ai, • • • , q;„, a, Q!„-|_i, • • • , a^), respectively. In other 
words, one of the X-operators is replaced by 



eq. (|121[) is reduced to 



Vk 



J OLOL ' 



dctD 



(+) 



detDr 



(122) 



Here dI^"* is a matrix in which cJ,(T)cQ,(r-f 0) is added to 
the original one. The equal-time Green function in D^^' 
should be Q^{+0) to keep the probabili ty positive (for 
J > 0). For J < see the appendix in (jHoshino et all 
2OIOII . In the case A: = all states contribute to the trace, 
and therefore Pi/po is given by 



7<?a(+0). 



(123) 



The ratios of the weights in Eq. (|12ip - (|123p change 
their signs depending on the signs of t h e cou pling con- 



stants. It was found in (lOtsuki et all 120071 ) that the 



probability remains positive in the case of antifcrromag- 
netic couplings, i.e., Jaa' > 0. This is consistent with the 
fact that the CS model with antiferromagnetic couplings 
is derived from the Anderson model, where the minus 
sign problem does not appear. 

Staggered susceptibilities and o ther two-part i cle cor re- 
lation functions are discussed in ( Otsuki et "all. l2009bh . 



D. The Kondo model 



-^a„ + iQ„(T„+l) — > Xa^^-^aiTn+l)Xaa„{T), (119) 

which corresponds to the change illustrated in Fig. [131 a 
segment a is inserted between a„ and a„+i with short- 
ening of the segment a„. In the corresponding removal 
process, one removes a randomly chosen segment. 

Following the discussion in Sec. III. CI and taking into 
account the proposal probabilities dr/N/S and l/(fc + 1) 
for insertion and removal {N is the number of local 
states) the ratio R of Eq. ((29l) becomes 



Pk+i N(3 
k + 1' 



Pk 



with Pk = WkdTi . . . dTk as in Eq. 
the ratio pk+i/pk is given by 



Pk+i 
Pk 



(120) 



SS]) . where for k ^ 



, det dL+^ det D, 



det Dq, det Dq.^ 



(121) 



Here I ~ Tn+i — r is the length of the new segment. 
Ti\^^ is the matrix with cJj(T„4_i)ca(r) added to D„, and 
Dq,^ is the matrix with one of the operators shifted in 
time according to cJ,^(r„-(-i) — >• cjj^(r). The ratio of de- 
terminants can be evaluated in 0{k) using fast-update 
formulas (see Sec. IX. A]) . If a = a„ in Fig. [131 the change 
is just an addition of a diagonal element Xc(q,(t), so that 



Perhaps the most important projected model is the 
spin 5=1/2 Kondo model which is typically written as 



H 



ka 



Ckcr + JS • (7c 



(124) 



where CTc ~ tplat/jc is the Pauli matrix for conduction 
electrons {ipl = (c|,cj^)). While it is possible to simulate 
this model directly using CT-HYB (jWerner and Millid . 
20061) . it may be more convenient for some applications 
to re-express it in Coqblin-Schrieffer form by introducing 
pseudo-Fermion operators f\ f to represent the different 
states of the local moment: 5* = ■^ip^^aip f . Rearranging 
gives 



H = Y.^kcl^ 



Cka 



ka 



(125) 



which is of the Coqblin-Schrieffer form Eq. (jllSp with 
,/crcr' = J and with an additional potential scattering 
given by w = — J/2. 

Carrying out the CT-J expansion requires knowledge 
of the c-clectron Green function G{z) in the presence of 
the potential scattering v. G{z) may be expressed in 
terms of the bare {v = 0) Green function G^{z) as 



(126) 
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In the simulation of the CS model, G^{z) is replaced by 
G(z) and the calculation yields the impurity t-matrix 
tj{z), computed with respect to G{z). To obtain the 
t-matrix t{z) of the Kondo model, Eq. (|124p . the contri- 
bution of the potential scattering should be subtracted 
from G{z). The full Green's function G{z) can be ex- 
pressed as 



G = G + GtjG = g" + g'^tg^ 



(127) 



Solving Eq. ([T?7)) for t{z) gives 

V tj 



t = 



1-vg" (i-ugo)2- 



VII. PHONONS AND RETARDED INTERACTIONS 
A. Models 

In this section we present the application of CT-QMC 
techniques to models of electrons coupled to harmonic os- 
cillators or, equivalently, to models of electrons subject 
to time-dependent (retarded) interactions. Such mod- 
els arise in the study of electron-ph onon coupling and 
if dy namical screening is important ( Werner and Millid . 
2OIOI) . 

In Hamiltonian form the quantum impurity model Hqi 
is supplemented by a boson Hamiltonian Hb and an 
electron-boson coupling H^i-b so that Hqi — > Hqi + Hb + 
Hci-B with 

HB + H,i_B=J2g:0''ibi + bx) + Y,^Ab.- (129) 



arguments shows that these interactions may be rep- 
resented in Hamiltonian form by adding a coupling to 
bosons, as defined in Eq. (|129p . 

Solving Hqi + Hb + -ffci-B requires keeping track of 
the bosonic sector of the Hilbert space, which in prin- 
ciple involves an infinite number of additional states. 
Previous approaches to the problem hav e involved ei- 



ther treating the bosons s emicl assically (iBlawid et al 



20031 : iDeppeler and Millisl . 120021 ) or truncating the bo- 



son Hilbert space, retaining only a finite num b er of b o- 



son states (ICapone et al. 



(128) ISangiovanni et al 



.2004 
20061 l2005^ . 



Roller et all l2004al lbl: 



The semiclassical ap- 
proach cannot account for quantal phonon effects such 
as electronic mass renormalization or superconductivity, 
while treating the boson Hilbert space directly adds very 
considerably to the computational overhead and there- 
fore limits what can be done. 

Two approac hes have been us e d in t he CT-QMC 
context. One ( Werner and Millie . 2007b) is based on 
a canonical transformation applied to the CT-HYB 
method and is (at least in its present form) restricted to 
models in which the operators O to which the phonons 
couple commute with the local Hamiltonian. For models 
(such as the single-site dynamical mean field theory of the 
Holstein-Hubbard model or of the dynamically screened 
Hubbard U) which fulfill these conditions an electron- 
boson coupling can be added at essent ially no additional 
comp utation cost. The other method (jAssaad and Langl . 
20071 ) is a generalization of CT-INT to time-dependent 
interactions and can treat more general models, although 
at a substantially higher computational cost. 



Here is the creation operator for a boson mode labeled 
by I/, O'^ denotes a bilinear fermion operator, and uj„, 
are the boson frequency and electron-boson coupling 
constant respectively. In the widely studied "Holstein- 
Hubbard" model, for example, there is just one boson 
mode and the operator O is the on-site electron density. 

An alternative representation in terms of an action 
may be obtained by integrating out the bosons, leading 
to the contribution 



E 

ab 



dTdT'0°(T)W^(T-r')0'(T'), (130) 



with 



sinh(;9cj'/2) ' 

(131) 

(132) 

Conversely, models of electrons subject to time- 
dependent (retarded) interactions are defined by an ac- 
tion involving a term such as S^ct and reversing the above 



B. CT-HYB 

In models where the oscillator degree of freedom cou- 
ples to a conserved quantity of the local Hamiltonian, 
the phonons can be decoupled from the electrons by a 
ca nonical transforma t ion of the sort originally introduced 
by iLang and Firsovl (|l962l ) . By using the transformed 
variables to evaluate the trace over the phonon states, 
the hybridization expansion can be p erformed with neg- 
ligible extra computational overhead ( Werner and Millia . 
2007bl ). 

We present the idea in the context of the single-site dy- 
namical mean field description of the Holstein-Hubbard 
model, for which the local Hamiltonian may be written 
as 



H\oc = -M(?it + 'H) + Unfni 



+V2A(nt + - 1)^ + y (-^' + • (133) 

Here the boson coordinate X and momentum P satis- 
fying [P, X] — i are related to the familiar boson cre- 
ation and annihilation operators hy X = {b^ + b)/\/2 
and P = {b^ — b)/i\/2, and the in the coupling term 
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and the notation of the coupling constant as A arc con- 
ventional. 

Following ( Lang and Firsov . 1962() . the boson and 
fermion operators may be decoupled by shifting X by 
Xq = (^/2X/ujQ)(n^ + nj, — 1). The shift is accomplished 
by the unitary transformation e^^^o so that the Hamil- 
tonian H\oc = e^^-^" H\oce~'^^^° becomes 



Hloc 



Ofi^n^ + '^iX^^ 



p2 



(134) 



-ffioc is of the Hubbard form but with modified chemical 
potential ji and interaction strength U, where 

p.^ H - X^/oJo, (135) 
[/ = [/- 2AVwo- (136) 

The impurity electron creation and annihilation opera- 
tors are transformed according to 

dt = e'^^^"4e-^^^" = e^^^'-'Ul (137) 



iPX, 



-,-iPXo 



(138) 



and this factor propagates into the hybridization. 

After the transformation, the electron and boson sec- 
tors are decoupled and the expectation value {■■ •)b be- 
comes the product of a term involving electron operators 
which is analogous to that computed for the Hubbard 
model without phonons, and a phonon term which is the 
expectation value of a product of exponentials of boson 
operators. The total weight of a configuration 

M{0^{n)}) ^ Wb{{0.dn)})wKuhb.rd{{0,in)}). (139) 

Here, WHubbaid is the weight of a corresponding configura- 
tion in the pure Hubbard impurity model (with parame- 
ters modified according to Eqs. ()135p and ()136p while the 
phonon contribution is 



wtimn)}) 



gS2,.A(T2„)gS2„ 



with < Ti < T2 < . . . < T2n < Si 



oSlA(Ti) 

' b 

(140) 
1 (-1) if the 



A{t) 



operator is a creation (annihilation) operator and 



''b). The expectation value is 



to be taken in the thermal state of free bosons, 
standard disentangling of operators leads to 



The 



Wbi{Oi{n)}) = exp 



J2 s^s.le""^'-^^'-^^" + e"°(^--^^)} 



2n>i>j>l 



(141) 

The phonon contribution can be interpreted as an in- 
teraction K(t — t') between all pairs of operators (see 
Fig. (HH) and (jWerner and Millisl . |2010[) ) of the form 



cosh[a;o(r - /3/2)] - coshK/3/2] 
''^") = -^ sinhN/3/2] ' ^'^'^ 
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FIG. 14 Illustration of an order ti = 4 diagram for the 
Holstein-Hubbard model. Empty (full) circles and squares 
represent V'^ {V) hybridization events. Dashed lines indicate 
interactions K( t) connecting all pairs of hybridization events. 
Adapted from (| Werner and Millisl . [2OI0I ') . 



0.07 
0.06 
0.05 
0.04 
0.03 
0.02 
0.01 



U/t=6 





20 40 60 80 100 120 140 
perturbation order k 

FIG. 15 Distribution of perturbation orders in converged 
single-site DMFT solutions of the half-filled Holstein-Hubbard 
model with a semi-circular density of states with bandwidth 
At, phonon frequency uoq = Q.2t, inverse temperature pt = 400 
and values of electron-electron [U) and electron-phonon in- 
teraction strength (A) indicated. Both in the insulating 
[U/t — 6) and metallic (U/t = 4) phases, the distribution 
shifts little as A is increased except near the phase boundary 
to the bipolaronic phase {X/t = 0.6, U/t = 4). 



which is the twice integrated retarded interaction 
(Eq. I13ip produced by the phonon coupling. The inclu- 
sion of phonons (or more generally any operator which 
commutes with i/ioc) is thus possible without any trun- 
cation and with negligible extra computational cost. 

Figure [T5] presents statistics on the perturbation order 
for a DMFT simulation of the Holstein-Hubbard model 
with semicircular density of states with bandwidth At and 
phonon frequency luq = 0.2t. The average perturbation 
order is seen to be little affected by the strength of the 
phonon coupling. Additional result s on the Holstein- 
Hubba rd model may be found in ( Werner and Milli^ . 
2007bl) . 

A closely related method has also been applied to 
study the consequences of dynamical screening of the 
Hubbard interaction by other degrees of freedom in the 
solid. Screening leads to a retarded interaction of the 
form of Eq. ([T30| with the on-site density and W" 
determined by the dynamical charge susceptibility of the 
other degrees of freedom in the solid. The passage back 
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3Jt/8 




FIG. 16 Single particle spectral function of the one dimen- 
sional Holstein model, computed as a function of frequency 
using "CDMFT" cluster dynamical mean field methods on 
an Lc = 12 cluster at filling n — 1/4 at high temperature 
(left panel) and low temperature (right panel). The spec- 
tra reveal a temperature dependent line broadening and the 
appearance at low T of a near-Fermi-level structure associ- 
ated with the dev elopment of intersite correlations. From 
Ref. l|Assaadl . [200i ). 



to Eq. (|129p provides an oscillator representation and the 
formalis m described above may b e applied. Details are 
ffiven in (jWerner and Millid . [ioioh . 



C. CT-INT 



Assaad and Land (|2007t) showed that the CT-INT ap- 
proach, too, allows a transparent treatment of phonon 
degrees of freedom. Their algorithm enables the simula- 
tion of wider classes of models than the canonical trans- 
formation approach but at much greater computational 
expense. To date it has been formulated fo r the Holstein- 
Hubbard model and applied (Assaad, 20081 ) to cluster dy- 
namical mean field studies of the one-dimensional Hol- 
stein model, and we follow this formulation in our de- 
scription below. The formalism however appears to apply 
also to non-Holstein couplings and further investigation 
along these lines would be of great interest. 

The treatment begins from an action formulation, with 
an interaction term which Assaad and Lang write as S* = 



Sjj + Sw with Sjj the usual Hubbard interaction and 

Sw = y^ dTdT'[n,{T) - 1]W{t - r')K(T') - 1]. 
^ Jo 

(143) 



with W given by Eq. [ml As in other CT-INT calcula- 
tions, it is advantageous to introduce auxiliary fields in 
the interaction terms to eliminate a trivial sign problem. 
Assad and Lang choose 



rP (j 



(144) 



with (T the physical spin, s = ±1 an auxiliary spin 
and aa{s) ~ 1/2 + as6, with 6 some constant (see also 
Sec. IIILA]) . The phonon term is shifted as 

Sw^ drdr' ^ ^ W{t - r') 

•^0 cr,(T' s=±l 

X [rii^criT) - a+(s)] [ni,a'(T') - a+(s)] . (145) 

Assaad and Lang then perform an expansion of the CT- 
INT type, but employing a general vertex 



V{t) = {i,T,a,T',a',s,v} , 



(146) 



where u enumerates the vertex types Hubbard {i> = 0) or 
phonon {v = 1). The sum over the available phase space 
becomes 



V{t) i,(T,(T',S,I/ " 



(147) 



and the weight of the vertex is 

w [V{t)] = <S,,o^(5(t - r') + 6,sW{t - t'). (148) 

Using furthermore the notation 

H[V{t)] = (5,,o<5<x,t'5.',i<5(T - r') 

K(t) - a+{s)] K(r) - a_(s)] + 

5„j [nair) - a+{s)] [ni^^'{T') - a+{s)] , 

the partition function can be written as 

X w[Vn{Tn)]lT,H[Vi{Ti)]...H[V,,{T,,)] 

(149) 

The Monte Carlo procedure follows the scheme described 
in Chapter mil with addition and removal of general ver- 
tices. 
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In a cluster dynamical mean field calculation of the 
one-dimensional Holstein model (Eq. (jl33p with U = 0) 
the method reveals interesting near-Fermi-level struc- 
tures in the electron spectr al function rel ated to inter- 
mediate range correlations (jAssaadl . 120081 ): see Fig. [12] 
for representative results. 

The flexibility of the method, which seems applica- 
ble also to CT-HYB and CT-J, and the importance of 
electron-phonon couplings in materials science suggests 
that the implementation and investigation of more gen- 
eral types of electron phonon couplings would be a worth- 
while effort. 



VIII. EXPANSION ON THE KELDYSH CONTOUR: 
REAL-TIME AND NON-EQUILIBRIUM PHYSICS 

A. Introduction 

In this section we describe diagrammatic Monte Carlo 
techniques capable of computing the real-time and 
nonequilibrium properties of quantum impurity mod- 
els. These methods have been used to calculate the 
transport properties and relaxation dynamics of current- 
biased quantum dots and as impurity solvers for dy- 
namical mean field studies of the nonequilibrium prop- 
erties of solids. Real-time CT-QMC methods were pi- 
oneered by Miihlbacher and Rabani who used a hy- 
bridization expansion method to study a problem of 
electr ons coupled to phonons (jMiihlbacher and Rabani . 
20081 ). The non-equilibrium hybridization expansion 
was gene ralized to the case of el ectron-electron interac- 

2QQ93), 



u=o 
v>o 



u>o 



v>o 



tions in (ISchmidt et all 12008). (IWerner et all 
(jSchiro and Fabriziol . |2Q09[) . (|Schir6l . |2Q10[) . while the 



real -time version of the C T-AUX metho d was given 
in (IWerner et all l2009bl) . (IWerner et all l2010l ) and 



Tsuji et al\ . I2010f ). It is important to bear in mind that 
unlike in the equilibrium case, where the algorithms have 
been tried, tested and optimized, the nonequilibrium ex- 
tensions of CT-QMC are still in an experimental stage. 
The methods which have been implemented so far are 
more-or-less straightforward adaptations of the equilib- 
rium CT-QMC algorithms. Significant improvements 
may be possible. 

While both CT-AUX and CT-HYB based meth- 
ods have been studied, we will restrict our explicit 
discussion in this section to the CT-AUX algorithm, 
which has allowed an accurate study of the steady- 
state current-voltage characteristics of half-filled quan- 
tum dots in the weak- and intermediate-correlation 
regime. For a discussion of the real-time CT-HYB 



used in (lEckstein eii aLl . l2009l l2Q10HEurich et all 12011 
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both CT-AUX and CT-HYB real-time algorithms are 



FIG. 17 Illustration of the Keldysh contour for a CT-AUX 
study of the Anderson model with interaction quench (top 
panel) and voltage quench (bottom panel). In an interaction 
quench starting from U — 0, the imaginary time branch of 
the contour is shifted to t = —00 and need not be explicitly 
considered in a weak-coupling Monte Carlo simulation. The 
red arrows represent auxiliary Ising spin variables. The top 
panel shows a Monte Carlo configuration corresponding to 
perturbation order n+ — 2, n_ = 2, and the bottom panel 
a configura tion corresponding to n+ — 3, n_ = 2, — 2. 
From Ref. (|Werner et all\201(i ). 



presented in detail. 



B. Keldysh formalism 

The basic theoretical task is to evaluate the expecta- 
tion value of some operator O at some time t, given that 
the system was prepared at time i = in a state de- 
scribed by the density matrix po- Using the Heisenberg 
representation the expectation value may be expressed 
mathematically as 

{Oit)) =Tr\poe'fodt'Hit')(J^-^f^dt"Hit")^ 

(the generalization to operators with multiple time de- 
pendencies is straightforward and will not be written ex- 
plicitly) . A nonequilibrium situation may arise through a 
time dependence of H (as occurs for example in a system 
'pumped' by a laser), through nonequilibrium correla- 
tions expressed by po (as occurs for a quantum dot with 
current flowing across it) or through an initial density 
matrix po which is different from the long-time (thermal 
equilibrium) limit, as occurs if a system is 'quenched' into 

a different st ate. 

One may ( Kadanoff and Bavm . 1962f ) view the expec- 
tation value in Eq. (|150p as an evolution on Schwinger- 
Kcldysh contours, examples are given in Fig. 1171 In each 
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panel the upper contour represents the evolution from 
initial time t = to measurement time t via e~'^*, the 
operator O is positioned at the bend where the lower and 
upper contours meet, and the lower contour represents 
the evolution from t back to t = via e*^*. 

The two panels of Fig. [17] also indicate two ways to 
prepare the initial state of the system. The upper panel 
indicates the standard approach, which wc call the in- 
teraction quench. In this approach one imagines that at 
times t < the interactions are turned off, so that po 
is the density matrix of the non-interacting, but poten- 
tially non-equilibrium system. At i = the interactions 
arc turned on, and one studies the subsequent evolution 
of the system. The lower panel indicates an alternative 
approach, the voltage quench. In this approach one pre- 
pares the system by performing an equilibrium simula- 
tion of the interacting model (accomplished formally by 
propagating along the imaginary branch of the contour 
shown in the figure) and then turns on the noncquilib- 
rium effects at time t = 0. 

The general strategy for evaluating Eq. (|150|) is the 
same as in the equilibrium case, namely to write 7? as a 
sum of two terms: one. Ha, for which the time evolution 
can be treated exactly and another, Ht, which is treated 
by a formal perturbative expansion. The expansion in 
Hb generates diagrams which are sampled stochastically, 
using an importance sampling which accepts or rejects 
proposed diagrams on the basis of their contributions to 
(O), with for example O = 1. Time plays the role of 
/? - l/T. 

There are crucial differences. In equilibrium calcu- 
lations, the expansion can be formulated on the imag- 
inary time axis < t < l/T as an expansion of 
TrTi-e"^^" exp[— J^^ drHbir)]. Thus one can work with 
real (or Hermitian) quantities and only one exponential 
must be expanded. In the noncquilibrium situation one 
must expand two exponentials, doubling the perturba- 
tion order required to reach a given time. Also, the re- 
sult of a measurement at a finite time depends on the 
initial preparation of the system. It is thus essential that 
the computation proceed for long enough to build up 
the correct entanglement between the impurity and the 
bath before steady-state quantities are measured. The 
main difficulty of nonequilibrium calculations, however, 
is that the expansion must be done for real times, so dia- 
grams come with factors of i raised to powers determined 
by the perturbation order. The terms in the expansion 
are complex so a 'phase' problem exists, but in all cases 
known to date the expansion can be arranged so that all 
terms are real. A sign problem however remains. Conver- 
gence of the perturbation theory is thus oscillatory rather 
than exponential and the result is a sign problem which 
severely limits the maximum perturbation order that can 
be attained and hence the maximum time which can be 
reached. 



C. Real-time CT-AUX 

Here we present the formalism needed for a nonequilib- 
rium application of the CT-AUX method. For simplic- 
ity we focus on a nonequilibrium version of the single- 
impurity Anderson model, Eq. ([T2|) . where the local 
Hamiltonian is coupled to two leads ( "left" and "right" ) 
which may be at different chemical potentials fia- Thus 
the bath and hybridization terms in the Hamiltonian be- 
come 

i?bath= J2 E«--A^")<-<- (151) 

a=L,R p 

a—L,R p.cr 

A crucial parameter is the level broadening 

r-ico)=7Tj2\Vp^\H{uj^e;) (153) 
p 

associated with lead a. The total level broadening is 

r = r^ + r'^. (154) 

r is the imaginary part of the real axis hybridization 
function. It plays a crucial role in what follows so we 
identify it by a separate symbol. 

In nanoscience applications one is interested in the cur- 
rent flowing through the impurity. The flow of charge 
into, say, the left lead may be determined from the time 
derivative of the number of left lead electrons = 
Tlipa'^paO-piy- Taking the commutator of with the 
Hamiltonian shows that the current flowing through the 
impurity into the left lead is determined by the t ^ t' 
limit of the quantity 

A{t,t') = E^/^ (Tcc^i(t)d<,(t')) . (155) 

pa 

Tc is the contour ordering operator, which exchanges the 
product A{t)B{t') of two operators if t is earlier on the 
contour than t' (a minus sign is added if the exchange 
involves an odd number of Fermi operators). Finding an 
efficient means of measuring A is an important part of 
the algorithm. 

In the nonequilibrium Anderson model an interaction 
quench corresponds to taking [/ = for times t < 
with an instantaneous step to a non-zero U aX t ~ Q 
while the chemical potential difference is time indepen- 
dent and the initial density matrix that appropriate to 
nonintcracting electrons in the given bias voltage. A 
voltage quench corresponds to taking /i^ = for time 
i < with an instantaneous step to a nonzero fJ-L — fJ-B. 
at t = 0. One assumes that the lead electrons equili- 
brate instantly to the new chemical potential so that the 
equal time correlators of lead operators are {Cpj^Cp, = 

<5a,/35p,p'(5„y /T„(ep^<, - /ia), with frix) = (e^/^ + l)-^ 
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the Fermi distribution function for temperature T and 
/^Q the value of the chemical potential for lead a at the 
appropriate time. 

A compact derivation of all measurement formulae for 
both voltage and interaction quenches may be obtained 
from manipulations of the "partition function" (more 
precisely, an expression for the expectation value of the 
operator O = 1) on the contour shown in the lower panel 
of Fig. (HIl): 

Z = e--^3Tr[e-^(^bath+^Sot+-f^hyb+-Wfj--ff^//3) 

X e-**(^Sath+^3ot+ffu+-ffhyb-A:t/t)j^ (155) 

The notation indicates that on the real-time por- 

tion of the contour the two leads may have different chem- 
ical potentials, whereas iJ^ath means that on the imagi- 
nary time portion of the contour the two leads have the 
same chemical potential. At this stage Kp and Kt are 
arbitrary constants. Convenient choices for Kp^t will be 
discussed below. 

In Eq. (|156p the interaction U on the imaginary 
time branch need not be the same as the interac- 
tion U on the real-time branches. The generalization 
to time-dependent U{t) or /Xf pft) i s straightforward 



(jEurich et all I2OIII : iTsuji et all I2OIOI ). In the voltage 
quench U = U while in the interaction quench U = Q 
and the imaginary time portion of the contour drops out 
of the problem. 

The time evolution along the real-time and imaginary- 
time contours is expanded in powers of Hu — Kt/t and 
Hij — I /?, respectively. Each interaction vertex is then 
decoupled using Ising spin variables (x = t or /3) 

Hu-K^/x = -— >' e''=''^"''.1'-"''•^^ (157) 



s=-l,l 



cosh(7:j) 



2x 

1 + (xC/)/(2X,), 



(158) 



as in Eq. (|69| . The resulting collection of Ising spin vari- 
ables on the contour represents the Monte Carlo con- 
figuration {(ti, si), {t2, S2), . . . {tn, Sn)}, with ti denoting 
the position of spin i on the L-shaped contour (see il- 
lustration in Fig. I17p . There are n+ spins on the for- 
ward branch, n_ spins on the backward branch and 
np spins on the imaginary-time branch of the contour 
(n = n+ + ?i_ + Up). The weight of such a configuration 
is obtained by tracing over the dot and lead degrees of 
freedom and can be expressed in terms of two determi- 
nants of n X 71 matrices N^^: 

P{{{tl,Si), (i2, S2), . . . (i„, Sn)}) = 

(-i"- )(i"+)(iftdt/2i)"-+"+ {KpdT/2j3)''^ W dot 

(159) 

N-^ = e^' - {iGo,a){e^' - I). (160) 



Here (Go,o-)y = Go,o-(ii, tj) is the ij element of the nx n 
matrix of non-interacting Green functions 



GoAt,t') = -i{TcdAt)dl{t'))o 



(161) 



computed using the possibly time-dependent chemi- 
cal potentials and evaluated at the time arguments 
defined by the Ising spins. The quantity e^" = 
diag(e'''i''i'', . . . , e'''"*"'^) is a diagonal matrix depending 
on the spin variables (with 7^ = 74 for spins located on 
the real-time branches and 7^ = 7^3 for spins on the imag- 
inary time branch). 

A Monte Carlo sampling of all possible spin con- 
figurations is then implemented based on the absolute 
value of the weights p59p . The contribution of a spe- 
cific configuration c = {(ti, si), (^2, S2), . . . (t„, s„)} to 
the Green function ( G'^) and current (A^) is given by 
(jWerner et all l2009bl ) 



Gl{t,t')^GoAt,t') 

71 

+ i J2 Go.a{t,ti)[{e- 

A^(f,t') = A0,a(t,i') 

71 



I)N„h,jGo.a{tj:t'): (162) 



I)N,],^jAo,a{tj,t'), (163) 



with the first term on the right hand side giving the con- 
tribution to the non-interacting Green function or cur- 
rent and the second term a correction due to the inter- 
actions. In Eq. (|163p 



AoAt,t') = ^l//(Tcc^J(t')d<.(t))o 



(164) 



denotes a dot-lead correlation function of the noninter- 
acting model. The Green function and current expecta- 
tion value is 

G,(i,t') = (G^(t,i')'/'c)/(0c), (165) 
lit) = -2Im^[(A^(t,t)0,)/(0e)], (166) 

(T 

where (.) denotes the Monte Carlo average and (j>c the 
phase of the weight of the configuration c. As in 
Eq. (jl98p . it is advantageous to accumulate the quantity 

n 

X„{si,S2) =i '^c(si,ii)[(e^" - l)Na]i,jScis2,tj), 

(167) 

which is related to the self-energy SbyX*Go = S*G 
(with * denoting a contour convolution). Furthermore, it 
follows from Eq. (|159p that the weight of a Monte Carlo 
configuration changes sign if the last spin (corresponding 
to the largest time argument) is shifted from the forward 
contour to the backward contour or vice versa. Since the 
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absolute value of the weight does not change, these two 
configurations will be generated with equal probability. 
As a result, all the terms in Eq. (|167p which do not in- 
volve the last operator on the contour will cancel. It is 
therefore more efficient to accumulate 

n 

^<t(si,S2) = i(l - <5({ti})) ^ x(.si,i;s2, j) 

n 

+ i5{{ti\) ^ [x(si,last;s2,0 +-'i;(si,/;s2,last)], 

(168) 

I 



with x{si,i\S2,i) = Scisi.tijlie^" - l)N„]i^j5c{s2,tj) 
and 5{{ti]) = 1 if max^ Re(<i) > and otherwise. 

In an interaction quench starting from U = 0, the 
imaginary-time evolution is not explicitly considered in 
the Monte Carlo simulation and temperature appears 
only as a parameter in the noninteracting Green func- 
tions (see Fig. [iTl). Moreover, the latter depend only 
on time differences, and thus can be easily expressed in 
terms of their Fourier transform. Assuming a large band 
cutoff a nd neglecting the real part of the lead self -energy 
we find (|jauho all Il994l: IWerner et all . l2009bf ) 



Go(t,t': 
Ao(<,i'; 



^9- \- ['^ -.c.(«-n r°(c^)(/(^-/ia)-ec(t,^0) 

27r^ (c. [7/2)2 + r2 ' 

a—L,R 

= _9- f — rL(^)rfl(a;)(/(^ - hl) - /(^ - Mfl)) 

V 2^"" (a. - £rf - [7/2)2 -t- r(u;)2 

,0 rUu^Xu; -eg- U/2){f{u: - ^l) - Qc{t, t')) 

27r (tj - - [7/2)2 + r2 



(169) 



(170) 



In the voltage quench, on the other hand, the interac- 
tion is non-vanishing on the imaginary time portion of 
the contour fFig. [T7)). while the chemical potential differ- 
ence jumps instantaneously from zero (on the imaginary 
branch) to V (on the real branches) . Because of the time 
dependence of the chemical potentials, the noninteracting 
Green functions are not time translation invariant and we 
cannot express Go^o- and the dot-lead correlator Aq^„ in 
the form of a Fourier transform. Instead, those functions 
must be computed num erically from t heir e quations of 
motion, as explained in (jWerner et adl2010f) . 



the spin degree of freedom effectively disappears (e'''^'^ = 
— 1) and the algorithm becomes the real-time version of 
the weak-coupling solver (Section IIIip for the particle- 
hole symmetric interaction term Hu — K^jx = [7 (iid,-\ ~ 
5 )("■£(, ^ — 5)- The odd perturbation orders are con- 
tinuously suppressed as approaches — x[7/4. This 
suppression of odd perturbation orders was essential in 
the nonequilibr i um dynami cal mean field calculations of 
(lEckstein aZ.l.l2009ll2010l) and the current calculations 
of (jWerner et a/.l . l2ni(t" 



D. Sign problem 

The sign (phase) problem in the real-time CT-QMC 
methods grows exponentially with the average perturba- 
tion order on the real-time branches, which in turn is 
proportional to the simulation time. Operators on the 
imaginary time branch do not add significantly to the 
sign problem. While accurate results can be obtained for 
average signs down to 10"'^ , this threshold is reached if 
the expected number of operators on the real-time con- 
tour is approximately ten. To reach long times or strong 
interactions, it is therefore important to reduce the aver- 
age perturbation order on the real-time branches as much 
as possible. In this context it is worth noting that in the 
particle-hole symmetric case, the parameters of the 
CT-AUX algorithm can be chosen such that only even 
perturbation orders appear in the expansion. In fact, for 



IX. COMPARISON OF THE EFFICIENCY OF THE 
DIFFERENT METHODS 

1. Average expansion orders and matrix sizes 

For all diagrammatic quantum Monte Carlo algorithms 
discussed here, the computational effort scales as the 
cube of the e xpansion order or matrix si ze, as discussed 



in detail in (jCuU et al\ . l2007t ). For a iHirsch and Fvd 



(171) 



( 19861 ) solver the matrix size is determined by the time 
discretization Ar = /3/N. In the case of the continuous- 
time solvers it is determined by the perturbation order 
k, which is peaked roughly at the mean value (k) deter- 
mined by the probability distribution p{k). In Fig. [181 
we plot these matrix sizes as a function of inverse tem- 
perature /? for fixed U/t ~ A and as a function of U /t for 
fixed pt = 30, for a semi-circular density of states with 
bandwidth 4t. 

It is obvious from the upper panel of Fig. [TH]that the 
matrix size in all three algorithms scales linearly with 
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FIG. 18 Upper panel: Bethe lattice, single site DMFT, scal- 
ing of matrix size with temperature at U/t = 4 for the 
Hirsch-Fye, CT-INT and CT-HYB algorithms. For Hirsch- 
Fye, the resolution A'' = jSU has been chosen as a com- 
promise between reasonable accuracy and acceptable speed, 
while the average matrix size is plotted for the continuous- 
time solvers. Lower panel: scahng of matrix size with U/t 
for fixed fit = 30. The solutions for U < 4.5 are metallic, 
while those for U > 5.0 are insulating. The much smaller 
matrix size in the relevant region of strong interactions is the 
reason for the higher efficiency of the hybridization expansion 
method. From Ref. (|GuU et aZ.1. 120071 ). 



/3. The Hirsch-Fye data are for a number of time slices 
N = /?[/, which is apparently a common choice, although 
Fig. [T^l shows that it may lead to considerable systematic 
errors. Thus, the grid size should in fact be chosen much 
larger {N > 5/3C/). 

While the matrix size in the CT-INT approach is ap- 
proximately proportional to U/t, as in Hirsch-Fye, the U- 
dependence of the hybridization expansion algorithm is 
very different: a decrease in average matrix size with in- 
creasing U /t leads to much smaller matrices in the phys- 
ically interesting region 4 < U/t < 6, where the Mott 
transition occurs in this model. The results in Fig. llSl and 
the cubic dependence of the computational effort on ma- 
trix size show why the continuous-time solvers are much 
more powerful than Hirsch-Fye and why the hybridiza- 



tion expansion is best suited to study strongly correlated 
impurity models with density-density interactions. 

There is of course a prefactor to the cubic scaling, 
which depends on the computational overhead of the dif- 
ferent algorithms and on the details of the implementa- 
tion. However, the results presented here indicate large 
enough difference between the methods that the effects 
of optimization are of secondary importance. 



2. Accuracy for constant CPU time 

The CT-INT, CT-HYB, and Hirsch-Fye algorithms 
considered here work in very different ways. Not only 
are the configuration spaces and hence the update proce- 
dures entirely different, but so also are the measurement 
procedures of the Green's functions and other observ- 
ables. 



Ref. (jGuU et all l2007f ) proposed that the performance 
of solvers should be compared by measuring the accuracy 
to which physical quantities can be determined for fixed 
CPU time. This is the question which is relevant for im- 
plementations and avoids the tricky (if not impossible) 
task of separating the different factors which contribute 
to the uncertainty in the measured results. Because the 
variance of the observables measured in successive itera- 
tions of the self-consistency loop turned out to be consid- 
erably larger than the statistical error bars in each step, 
the mean values and error bars were determined by aver- 
aging over 20 DMFT iterations starting from a converged 
solution. 

The Hirsch-Fye solver suffers from additional system- 
atic errors due to time discretization. These system- 
atic errors are typically much larger than the statisti- 
cal errors. In order to extract meaningful results from 
Hirsch-Fye simulations it is essentia l to do a carefu l (and 
time-consuming ) Ar ^ analysis (jBllimeil . [200I . The 
continuous-time methods are free from such systematic 
errors. 

The high precision of the hybridization expansion re- 
sults for the kinetic energy indicate that this algorithm 
can accurately determine the shape of the Green's func- 
tion near r = and /?. 

For the self-energy. 
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the Matsubara Green's functions have to be inverted and 
subtracted. This procedure amplifies the errors of the 
self-energy especially in the tail region where C/o(iaj„) and 
G(iw„) have similar values. Fig. I19[ in contrast, shows 
low frequency results ImI](icJo)/'^o for [//t = 4 and sev- 
eral values of /3. This quantity is related to the quasi- 
particle weight Z k, 1/(1 — ImE(ia;o)/a;o). Again, the 
Hirsch-Fye results show systematic errors due to the time 
discretization which must be extrapolated. The results 
from the continuous-time solvers agree within error-bars. 
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FIG. 19 Self-energy ImE(za;o)/<^o at the lowest Matsubara 
frequency lvq ~ t^T as a function of j3 for U/t — 4.0. The 
Hirsch- Fye results exhibit large discretization errors, while the 
continuous-time methods CT-INT and CT-HYB agree within 
error bars. CT-HYB is particularly suitable for measuring 
quantities which depend on low-frequen cy components, s uch 
as the quasi-particle weig ht. From Ref. (|GuU et aUbOOTl ). 



but the size of the error bars is very different. The hy- 
bridization expansion approach yields very accurate re- 
sults for low Matsubara frequencies in general. 

The advantage of measuring in Matsubara frequencies 
as opposed to imaginary time in the CT-INT and CT- 
AUX algorithms becomes apparent for large Only 
the difference of G to the bare Green's function Qq has 
to be measured in this algorithm (see the detailed discus- 
sion in Sec. lX.Cl in particular Eg. llQSp . These differences 
decrease with l/a;„ for large w„ and Eq. (|196p yields an 
accurate high frequency estimate, so that the tail of the 
self energy can be computed without amplification of er- 
rors. 

Further discussion of the re lative advantages of differ- 
ent methods can be found in (jGuU et all 120071) . 



X. TECHNICAL ASPECTS 

The following sections, independent from each other, 
are referenced from the algorithm sections and explain 
aspects of updates, measurements, and numerical meth- 
ods needed for the efficient implementation of these 
continuous-time algorithms. Efficient and accurate mea- 
surement, especially of the high frequency behavior, re- 
mains a bottleneck in the computations; further progress 
in this area would be desirable. 



A. Inverse matrix formulas 



determinants of matrices D'^ of size k and 0^^+^ of size 
fc-f 1, 



detD*^ 
detD*^ 



(173) 



with matrices that have one row and one column (some- 
times two rows and two columns) changed, added, or re- 
moved. The only exceptions are given by the hybridiza- 
tion expansion in its general formulation (Sec. IV. C|) . 
which is dominate d by a trace compu tation, and the bold 
CT-QMC method (|Gu11 et alikOKt . where the determi- 
nant structure is replaced by an analytic resummation of 
diagrams. 

To compute the determinant of large matrices directly, 
it is best to first perform a factorization like the LU or 
QR factorization, where the matrix A is written as the 
product of a matrix of which the determinant is known, 
and another matrix where the determinant is easy to 
compute, e.g. the diagonal of an upper / lower triangular 
matrix. The cost of such a straightforward factorization 
is 0(fc3). 

Determinant ratios of two matrices that differ only by 
one or two rows and columns can be computed much 
more efficiently if the inverse of one of the matrices 
is known. This is the reason for computing the in- 
verse Green's function matrix M = in the CT- 
INT algorithm, the inverse hybridization function ma- 
trix M = in the hybridization algorithm, and the 
matrix N in the CT-AUX algorithm. We illustrate the 
linear algebra at the example of the CT-AUX matrix N of 
Eq. ((711) introduced in Sec.|lVl The formulas for Eq. (gS]) 
(CT-INT) and Eq. ^ (CT-HYB) are computed analo- 
gously. 

We start by considering a configuration at expansion 
order fc, characterized by an N-matrix of size fc x fc, 
and consider the insertion of a vertex, thereby enlarg- 
ing the configuration to A: -I- 1 vertices. For ease of writ- 
ing we choose a basis such that the rows and columns 
changed arc the last ones, though of course in the code 
any row/column can be changed. Inserting a vertex into 
the configuration of order k leaves most of the inverse of 
N unchanged (Eq. (|74p): it adds one row (here called R) 
and one column Q to it. enlarging it to a {k+1) x {k + 1) 
- matrix. However, changes to the new N*'"'"^— matrix, 
denoted by quantities with a tilde, are dense: 



k+l\-l _ 



R S 



R S 



(174) 
(175) 



The dominating computational task in most 
continuous-time quantum Monte Carlo impurity 
solver algorithms is the computation of ratios r of 



P is of size k x k, the vectors Q, Q and i?, R have size 
{k X 1) and (1 x fc), and S,S are scalar. A block calcu- 
lation shows that the elements of the matrix N*'^^ may 
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be computed from N'^ , R, S, and Q : 
^=(5-[i?][NWQ])-\ 
Q = -[NWg]5, 

P = N*'^' + [N('^)Q]S'[i?N('=) 



(176a) 
(176b) 
(176c) 
(176d) 



The determinant ratios needed to accept or reject an 
update in Eq. ^7^, Eq. or Eq. are given by 



det(N'^-+i)-: 
dct(N'^-)"^ 



dots' 



dct{S - RN'^Q), (177) 



as can e.g. be seen from an LU decomposition of the 
block-matrix (N'''+i)"^ 

The computational effort for computing the insertion 
probabihty W^y"^ of a spin (or vertex, or segment) is 
O(fc^), or a matrix- vector multiplication followed by an 
inner product, as in Eq. (|176ap . The removal probabil- 
ity is computed in 0(1), as S is an element of N'^+^ 
and therefore already known. If an update is accepted, 
an 0{k^) rank one update has to be performed for 
Eq. (|176dp . As approximately k updates are needed to 
decorrelate a configuration, the overall algorithm scales 
as 0{{k)^), with (k) the average expansion order. Note 
that the acceptance probabilities for vertex insertions or 
removals are more expensive than spin-flips in the case 
of the Hirsch-Fye algorithm (O(fc^) vs. 0(1)), while ac- 
cepted updates require rank one updates (0(/c^)) in both 
cases. 



B. Spin-flip updates 

In CT-AUX, if only the value of an auxiliary spin is 
changed and not the imaginary time or site index of a ver- 
tex, a Dyson equation similar to the Hirsch-Fye Dyson 
equation may be employed. For a spin-flip from inter- 
action Vp^ to Vp^ of spin pq at Monte Carlo step q wc 
obtain: 



Spin-flip proposals are 0(1) (as in Hirsch-Fye), and the 
same linear algebra applies. Spin-flip updates are not 
crgodic in continuous-time algorithms; updates which 
change the expansion order and vertex times are needed. 



1. Delayed spin-flip updates 

Spin-flip updates can be separated into two parts: the 
computation of the acceptance ratio R (Eq. (|182p ). and 
the update of the Green's function after an accepted 
spin-flip move. "Dela yed" updates, a concept devel- 
oped by lAlvarez et al\ (.2008) for the Hirsch-Fye algo- 
rithm and applied to continuous-time methods in 
( 2008h . delay the (expensive and slow) update of the 
Green's function to a later time, while computing a se- 
quence R^, - ■ ■ , j^g""" ' of Monte Car l o spiii -flip acceptance 
ratios. In analogy to Alvarez et al. ( 2008h . we define two 
vectors af and 6j (compare to Eq. I179P as 



A''((NG") 



0\q 



^iPq ) J 



7V« .. 

PqJ 



(185) 
(186) 



C}. „ , which is 

PqPq ^ 



For i?' we need to know (NGq)^ p 
computed by Eq. (fT84)) from N^^p\ " 

At the first step (9 = 1) N = N" is known. We start 
by selecting a spin p^. We then compute R^ according 
to Eq. p82p . and accept or reject the update. 

In a next step [q = 2), we choose the spin p2- In 
order to compute R? 
compute as 



we need to know N^^^^ , which we 



(187) 



More generally, the jth diagonal clement — Nj^ 
after q (accepted) spin-flips is given by 



N. 



r 



(NG")?^. + ((NG")?^^ 



9+1 _ 



7V« 

^3 



A" = 



((NG*^) 

7 



0^9 

iPq 

1 



l + (l-(NGo^'' 



PqPq 



S^pJX'^iNGX,^ 



(178) 

S^pJX'^{N)l,, (179) 



— , (180) 



eKq-^Pq 



1 



= l + il-{NGo)lpX 



(181) 
(182) 



i?' is the spin-flip acceptance ratio. The expression 
(NGo)/„j = G/„j can easily be computed from the iden- 
tity 



iV,,G°,(e^^-l) 



(NGo)i,„ = (iVi^e^ 



<5,„)/(e^" 



(183) 
1) (184) 



(188) 



1=1 



We define two vectors, co^' and row'^, that itcratively 
recompute the elements of the matrix N for the row and 
column pq : 



1=1 

row'' =N° ,+yal b\ = Nl 

J PqJ / J Pq J Pi 



9 

OPq 



(189) 

(190) 



1=1 



These are sufficient to compute the new g-th column 
(row) of the matrices a] (6]) (Eq. (fTHSj) and (fTHej) ) and 
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the new diagonal vector dj : 



Kmpqe""^^ - S.pJ/ie'''^ - 1) - S^pJ 



(191) 



6j = rowp 



9+1 



(192) 
(193) 



As seem in Eq. (|182p . is needed to accept or reject 

the next spin- flip at the next proposed position Pq+i- 
After some steps (jmax we retrieve the full N-matrix by 
computing 



1=1 



(194) 



A complexity analysis shows the cost of the de- 
layed spin-flip updates: Eq. (|189p and (|190p are 0{qk), 
Eq. (flM)) is 0{k), and Eq. pM)) is an 0{k'^q^i,^) matrix- 
matrix multiplication. The reason for performing de- 
layed spin-flip operations instead of straightforward spin- 
flips or insertion and removal updates is that, on current 
hardware architectures, the final matrix multiplication in 
Eq. (|194p is about a factor 10 faster for large (i.e. out-of 
cache) matrices than successive rank one updates, as fast 
matrix operations that reuse data can be employed. The 
additional overhead of computing a, 6, d, rou", and col 
will dominate the algorithm for large q^^^. We there- 
fore recompute N often enough that the overhead does 
not dominate, but that we can still take advantage of 
the matrix operations . In practice omax _ 32 or 64 



are reasonable values ([Alvarez et all 120081 ). also in the 



conti nuous-time algor ithms . For more inform ation see 
also (|Mikelsonsl . mm and (|Gu11 et aLl . boilal) . 



C. EfFicient measurements in the CT-AUX and CT-INT 
formalism 

In the CT-AUX and CT-INT algorithms the Green's 
function measurement formula Eq. ()83p and Eq. (|60|) in 
imaginary time, for sites i and j and at times ti and Tj, 
is 

G^,An-T,) = gl,,{n-T,) (195) 



Measurement using Eq. (jl95p in the imaginary time do- 
main has a crucial drawback: To sample the smooth func- 
tion Gij (r) the formulas need to be evaluated for definite 
T on some grid (which may be chosen non-equidistant). 
Further processing, e.g. Fourier transforms, may intro- 
duce discretization errors caused by this grid. As the cost 
computing G straightforwardly is proportional to the 
number of imaginary time points at which it needs to be 
evaluated, a fine grid of time points becomes prohibitively 
expensive. In addition, G estimated by Eq. (|195p has a 
further drawback: the observable average G{Ti — Tj) is 
translationally invariant, while the estimator explicitly 
depends on two times , Tj , so that translation symme- 
try needs to be restored by the random walk. 

In the Matsubara frequency domain, Eq. (|196p . there 
already is a discrete grid of frequencies a;„ = {2n + 
/ j3, 71 = 0, 1, 2, . . .. The summands inside (•)mc decay 
as A measurement method implemented directly 

in frequency space measures all frequencies up to a max- 
imum cutoff Wmax- To obtain the number of frequency 
points needed we use information from a high frequency 
expansion of the self energy or the Green's function, and 
automatically adjust the cutoff frequency such that sys- 
tematic errors from the cutoff are much smaller than sta- 
tistical (Monte Carlo) errors. This controllability makes 
this method the preferred one for high accuracy measure- 
ments of the Green's function. 

In translationally invariant clusters, only diagonal en- 
tries of the Green's function in k-space are non-zero. For 
a cluster with Nc sites this implies that only Nc inde- 
pendent fc-space Green's functions need to be measured 
(instead of real space Green's functions) , at the small 
cost of performing a (real space) Fourier transform. 

Computing the exponential factors exp(±za;„r) needed 
for the frequency measurement is expensive. Even with 
fast vectorized functions available as part of numerical li- 
braries, these operations are so time consuming that they 
may dominate computer time in large simulations. An 
obvious simplification consists of creating a fine imagi- 
nary time grid. At the start of the simulation, exp(iw„T) 
is computed for all ujn needed and all r on that grid, and 
the exponentials in Eq. (|196p are taken from it. This 
eliminates the expensive calculation of e*""^ at runtime 
at the cost of some (but relatively little) additional mem- 
ory. We did not observe any inaccuracies introduced by 
this discretization. 



where Xp{xq) and Tp (r,) denote the site and time of the 
vertex at row (column) p (q) of M. Fourier transformed 
to Matsubara frequencies, the Green's function is esti- 
mated as 

G^j^^iiUn) ^ g^^^lUJn) (196) 



1. Self energy binning measurement 

An effi c ient m easurement method, presented in 
iGuU et all (|2008al) . is based on measuring EG = S. M 
plays the role of a T-matrix: MQ'^ = SG. This mea- 
surement method works in imaginary time but does not 
have the drawbacks described in the previous section. 
The measurement formula (omitting the spin index) is 
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rewritten as 



E 

pq 



collaborators (jProkofev et al\ . 119981 ) and, among other 
problems, applied to th e attractive - U Hubbard model 



MC 



(jBurovski eraZl . l2006bh . The name "worm" refers to two 



'' I 

pq 

Jo , 



(197) 



dangling Green's function lines in the diagrams that build 
the head and tail of the "worm" . 

Instead of generating configurations of Z and measur- 
ing G, a worm method stochastically samples both the 
series for Z and the series for Gij^^iTi — Tj) simultane- 
ously. To this end the configuration space C is enlarged 
to include both the set of diagrams for Z , Cz and the set 
of diagrams Cq for G: 



C=CzU Ca- 



irn) 



The Matsubara Green's function can similarly be ex- 
tracted directly from the expectation value of S: 

/ 



(198) 



In the Monte Carlo process only the quantity (S')mc is 
measured and binned into fine (typically 10000) bins. 
The cost of this binning process is independent of the 
number of time discretization points of 5, and only re- 
quires the evaluation of M Q*^ at runtime. In practice we 
employ the translational invariance in the time - domain 
to obtain multiple estimates of the Green's function at 
the same step, and perform a matrix-matrix multiplica- 
tion of the matrix Mpq and a matrix Q^j = Gg s i'''q~ ) 
with randomly chosen Tj to obtain estimates for S. The 
method is accurate and significantly faster than the other 
methods presented here and is therefore the measurement 
method of choice for the CT-AUX and CT-INT algo- 
rithms, unless access to large clusters or high precision 
in the high frequency part of the self energy is needed (e. 
g. for analytic continuation), in which case we use the 
frequency measurement. 



D. Green's function ("worm") -sampling 

In all algorithms discussed so far, diagrams or config- 
urations were generated with the weight that they con- 
tribute to the partition function (Sec. III.Cp . For mea- 
surements, Green's function diagrams were then obtained 
by modifying such partition function configurations. A 
priori it is not clear that the Green's function configura- 
tions with large weight are the ones created by modifying 
configurations important for the partition function. If 
the Green's function estimates generated by importance 
sampling of partition function configurations are not the 
ones with large contributions to the Green's function, 
the Green's function estimator obtains a large variance 
and therefore the measurement of the Green's function 
becomes inefficient. This problem may be overcome by 
employing a "worm" algorithm. The concept was origi- 
nally developed in the bosonic context by Prokof 'ev and 



A new partition function of the combined system is de- 
fined by extending Z by the sum over all Green's function 
diagrams, with an arbitrary factor i] that controls the rel- 
ative importance the Green's function sector Cg and the 
partition function sector Cz, 



V 



dTidT2Gij^a{Ti,T2) 



(200) 



In practice, t? is chosen such that both summands for Ztot 
have non- vanishing weight. 

The random walk and updates are modified such that 
the entire space C is sampled: In addition to the parti- 
tion function updates, "worm insertion" and "removal" 
updates, i.e updates that transition between Cz and Cg 
by inserting or removing Green's function operators, as 
well as updates in the Green's function space like the 
shift of Green's function lines or vertex insertions (and 
removals) in the Green's function space need to be con- 
sidered. Updates that change the vertex part of a Green's 
function configuration are important, as they allow im- 
portance sampling for all elements of a Green's function 
diagram. 

Measurements in real space and imaginary time are 
straightforward: A histogram of worm positions with the 
appropriate sign needs to be recorded. Such an imagi- 
nary time measurement yields estimates for the Green's 
function with continuous times. These estimates are best 
measured on a fixed, but preferably non-uniform, grid by 
proposing "worm shift" updates onto measurement loca- 
tions. 

Worm methods have been in iplen i ented both for CT- 
HYB and CT-INT algorithms (|Gu11| . l2008l) . For equilib- 
rium DMFT simulations, without reweighing, the worm 
algorithm did not result in much better statistics than 
the partition function algorithm, as sampling problems 
appear to be minimal. Combined e.g. with "Wang Lan- 
dau" techniques the worm method offers the possibility 
to perform reweighing of the Green's function to obtain 
better statistics. Worm up dates are howeve r crucial in 
a "bold" samphng method ( Guh et al . 2010[ ) where, due 
to a partial resummation of some diagrams, there is no 
direct relation between Green's function and partition 
function diagrams. 
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E. Wang Landau sampling 

In the usual sampling process of the partition function 
CT-INT algorithm, as well as in the other algorithms 
described in Sec. |IV] and |Vl diagrams of the expansion 
(elements of the configuration space) are sampled with 
the weight that they contribute to the partition func- 
tion. Observables are then measured in this ensemble. 
However, we are free to sample any arbitrary ensemble - 
as long as the proper rewcighing according to Eq. ()24|) 
is performed. While the samples generated arc likely 
to have a larger variance [Eq. (f23|) ]. there may be other 
advantages, in particular smaller autocorrelation times. 
Here we describe so-called flat-histogram sampling meth- 
ods. These methods are particularly useful for problems, 
such as flrst order phase transitions, with barriers in the 
c onfiguration space o f t he Ma rkov walker. 

Wang and Landau (|2001allbh presented a general 



rewcighing scheme that is designed to find and overcome 
barriers and phase transitions without prior knowledge 
of where in phase space they ar e. The me thod was ex- 



tended to quantum problems bv lTrover et al\ (|2003[ ). In 
the quantum Wang-Landau method the key is to reweigh 
the perturbation series so that all orders up to some fcmax 
are sampled with approximately equal probability (see 
Fig [^0)1 . fcinax has to be chosen in such a way that all lo- 
cal minima of phase space have some overlap with order 
^max, so they can be reached by flat histogram sampling. 
For the reweighed system, the acceptance ratio ((57)) is 




FIG. 20 Sketch of the histogram of the expansion order for 
flat histogram sampling, s-axis: expansion order k. y-axis: 
histogram Left panel: no flat histogram sampling. Mid- 
dle panel: flat histogram sampling up to half of the maximum 
order. Right panel: flat histogram sampling up to the maxi- 
mum contributing order. From Ref. (Gull . ,200a l 



replaced with R-j^ with 



1, fc > fcjna 



(201) 



where p{k) is the probability of having expansion terms at 
order fc in a non-reweighed sampling. Rewcighing factors 
need also be taken into account while calculating averages 

(Eq. (m). 

The probability p{k) is unknown at the start of the 
simulation. Therefore the rewcighing coefficients are ad- 
justed as the simulation proceeds: the value of A(fc) is 
slightly decreased for the frequently-visited values of fc 
and increased for rarely- visited ones, until the histogram 



is flat. As the ensemble A(fc) does not enter the expecta- 
tion value of the observables, it is not important to have 
a very accurate estimate of it, as long as it is sufficient 
to ensure ergodicity. 

The reweighed algorithm generates diagrams both at 
the physically interesting orders and at orders that are 
very close to zero, i.e. the bare Green's function or non- 
interacting partition function in CT-INT. Deliberately 
generating conflgurations that contribute little weight to 
the partition function may seem inefficient, as the idea 
behind importance sampling is to generate the diagrams 
with the importance they contribute to the partition 
function. However, when revisiting the noninteracting 
case at 0*'' order of the series, all vertices and therefore 
all correlations are removed, and when the series is re- 
built it will likely end up in a different part of phase 
space - for example in a different global symmetry sec- 
tor, thereby avoiding trapping in local minima. In other 
words, the method aims to provide a reduction of the au- 
tocorrelation time for the Markov walker. Closer analysis 
shows that the algorithm can be improved by minimiz- 
i ng the round-trip time between low and high order states 



(jPaval et all \2004 iTrebst et al\ . \2004 



In practice, flat-histogram sampling turned out to be 
very efficient at o btain i ng sy mmetrized, paramagnetic 
Green's functions (jGulll . 120081 ). The fact that most con- 
figurations sampled have low order and contribute next 
to nothing to the observables is compensated by the fact 
that they are quick to sample due to the O(fc^) scaling 
of the matrix operations. 

A further important application of the flat-histogram 
methods is the calculation of thermodynamic potentials. 
The grand potential of the Hubbard model on a finite lat- 
tice at temperature T, for example, can be found by the 
integration of the equation dQ = —Ndfi + DdU, because 
the quantities on the right hand side can be measured 
in a standard simulation. Such a procedure, however, 
requires several simulations for a finite U range. For 
the Hubbard model (with no D MFT self-consis tency) 
a more elegant and efficient way (|Li et all l2009l ) is to 
employ flat-histogram methods to obtain the partition 
function Z directly, as the zero order term for Z is just 
1. Given the rewcighing factors A(fc) and frequency Pk 
with which different perturbation orders have been vis- 
ited during the random walk, the partition function is 
computed as Z = {PoX{0))-^ PkXik). Knowing all 
thermodynamic potentials can be calculated. As an ex- 
ample, we present in Figure [21] the graph for the entropy 
of a 4 X 4 Hubbard cluster. If a converged solution of an 
impurity model is available the same technique may be 
used to compute the impurity model partition function, 
but relating this to the thermodynamic properties of the 
full lattice model requires taking into account the varia- 
tion of the bath density of states. This has not yet been 
explored. 



36 



1.4 
1.2 
S 1 
0.8 
0.6 
0.4 




U = 4 □ 

U = 1 ^ '^a^ 

0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 
T/t 



FIG. 21 Entropy per site of the half- filled Hubbard model 
with t — 1, compute d for a periodic 4x4 cluster at different 
temperatures. From (|Li et all |2009| ). 



F. Computation of the trace for general interactions in the 
hybridization expansion 



occupation number basis [Hi^^^Sz] — — [Hi^c.N] further symmetries 
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FIG. 22 Sketch of results of applying rotation / block diag- 
onalization operations to the local Hamiltonian. The Hamil- 
tonian in the occupation number basis (depicted in the left 
panel) is sparse but not blocked. A first permutation oper- 
ation builds blocks according to the occupation number and 
spin of the local Hamiltonian, leading to a Hamiltonian (de- 
picted in the middle panel) which is nearly block diagonal 
with dense blocks. A second (rotation) matrix further re- 
duces block size by considering rotational and translational 
invariance of the impurity Hamiltonian, leading to the sparse 
block structure shown in the right panel. 

In the general formulation of the hybridization expan- 
sion, as derived in Eq. (j93p . the principal computational 
difficulty is the evaluation of the trace of a product of op- 
erators and exponentials of the local Hamiltonian. In a 
given basis this corresponds to taking the trace of a prod- 
uct of 0(fc) (large) matrices that have the linear size nioc 
of the local Hilbert space. Matrix-matrix multiplications 
of matrices with size nioc scale as Oinf^^). It is there- 
fore important to find a way both to reduce the size of 
the matrices that need to be multiplied as well as the 
number of matrix-matrix multiplications that have to be 
performed. 



Computing the 



(iMoler and Loanl 120031 ) is an 



exponential 
is 



of 

expensive 



a matrix 
operation. 



In the following we transform to the eigenbasis of the 
local Hamiltonian by diagonalizing it. In the eigen- 
basis, exp(— iJr) is diagonal. The (formerly sparse) 
local creation and annihilation operators become dense 
matrices. 



1. Block diagonalization 

The local Hamiltonian Hioc has symmetries. While 
these symmetries are dependent on the exact form of 
the local Hamiltonian, usually the total particle num- 
ber A'tot, the total spin z-component ^2 and rotational 
or translational symmetries of the impurity Hamiltonian 



are conserved: [H\oc,Nu 







loc, ^tot\ 



This im- 



plies that the local Hamiltonian may be decomposed into 
a block-diagonal form, containing several blocks with size 
'^biock *C nioc- This procedure is illustrated in Fig. ()22p . 
The advan tage of changing to a block-diagonal form 



(jHauld . 120071 ) is that operators di 



matrix form (see Fig. | 



dj are also in block- 
The operator d\^, for ex- 
ample, raises both the total particle number and the to- 
tal S'z-component by one and therefore consists of off- 
diagonal blocks connecting the (^2 , n) - symmetry sector 
with the (^2 + 1, n -|- 1) - sector. As the most expensive 
part of the code is the computation of matrix products, 
which scales as 0(X;biock "block) or 0{nl^^^ ^j^^J instead 
of 0{n ^„^ the advantage of usinp symmetries is ob- 



vious (|Gull el all l2008bl : iHauld . 120071) 




FIG. 23 Sketch of one of the optimizations in the general 
representation: Four symmetry sectors are drawn, for which 
Sz and A'^ are different. After the trace of d| and is taken 
only one of the symmetry sectors still contributes. In the im- 
plementation, we first identify which symmetry sectors con- 
tribute, and then compute the matrix product and trace only 
for these sectors. Additional symmetries vastly simplify the 
computation. 



A typical example is the four-site Hubbard plaque- 
tte with next-nearest neighbor {t' —) hopping. The local 
Hamiltonian has a size of 256 x 256 elements (4* local 
states). However, H commutes with n^,n^ and has a 
four-fold rotational symmetry (or a couple of inversion 
and mirror symmetries). This allows us to split up the 
256 X 256 matrix into 84 small blocks that have at most 
16 X 16 elements. 



37 



An appropriate basis choice also allows insight into the 
physics. The CT-HYB formalism allows one to deter- 
mine which impurity model states make the dominant 
contribution to the computation of an observable, and 
the matrix of eigenstate occupation probabilities is the 
projection of the density matrix onto the localized orbital 
basis. This information is much more easily interpreted 
if a physically motivated, symmetry-related basis choice 
is made. For examples see Figs. [30l and l42l 



2. Basis truncation 




FIG. 24 Top panel: Binning algorithm: binning of the k oper- 
ators into \/JJ^ bins, each having approximately Vk elements, 
reduces the effort of computing the trace after inserting or re- 
moving an operator to 0(-\/ {k)). Botto m panel: bi nary tree 
for the tree algorithm, 0(log{fc)). (Ref. (|Gul]| , I2008D ') 



As noted in iHauld (120071 ) , in situations where the lo- 
cal Hilbert space is prohibitively large, e.g. in the case of 
large multi-orbital problems or clusters, the computation 
of the trace is only feasible if the size of the local Hilbert 
space is reduced by an appropriate truncation of the ba- 
sis. In systems with very highly excited states that are 
unlikely to contribute (e.g. the 5, 6 or 7 -electron states 
in Cerium) , it is common practice to simply truncate the 
local Hilbert space and eliminate these states entirely. 
The same can be done for high energy / high momentum 
states in clusters. In addition to that, the highest few 
excited states of the local Hamiltonian in a particular 
symmetry sector may be truncated. 

Simple truncation based on some a priori criterion is 
an uncontrolled approximation justified only by a need 
to solve a particular problem with available resources. 
Truncation based on the eigenvalues of the local Hamil- 
tonian only is especially dangerous, as the hybridization 
may broaden and shift levels. Truncation is likely to 
introduce systematic errors. Short excursions into infre- 
quently visited states are often needed to produce transi- 
tions between frequently visited states. For example, in 
the large U Anderson impurity model it is the rare tran- 
sitions into the states with n = 0, 2 that produces spin 
flips. If truncation is to be used, it is advantageous to do 
so in two steps: First one does a short simulation, keep- 
ing as many states as feasible, while keeping a histogram 
of visited states. Even if this simulation is not fully ther- 
malized or long enough to allow accurate measurements, 
it will enable an identification of frequently and infre- 
quently visited states, which may be used to construct a 
truncated Hilbert space for extensive simulation. 

A "dynamic" truncation method that speeds up the 
calculation of the trace without introducing errors in- 
volves checking if exp(— AriJioc) falls below machine pre- 
cision or some other threshold, and if so not computing 
the remainder of the product of that particular part of the 
trace. Unlike in the "static" truncation case described 
above, short excitations into highly excited states are 
still possible, but the computational gain is significantly 
smaller. 



3. Binning and tree algorithms for the hybridization expansion 

The most expensive part of the algorithm is the com- 
putation of the trace, which is linear in the numbers 
of hybridization operators present in the configuration. 
Computing the complete trace in the general case will be 
0(fc), as each operator matrix must be accessed at least 
once. However, recomputing the trace after an opera- 
tor insertion or removal update allows simplifications: A 
first step is trivial to implement and reduces the effort 
to 0{y/k): the operator trace is chopped into around 

(fc) intervals between zero and /?. We then store the 
matrix product of all the operators within this interval, 
such that each sub-interval contains approximately 
operators. If we insert two operators, we will change the 
matrix product of one or two intervals - which need to be 
recomputed at the cost of ^/k operations. The whole re- 
compute operation is therefore of 0{\/k), and a sweep of 
0{k^/'^). This algorithm is illustrated in the upper panel 
of Figure [Ml 

A better, but more complicated algorithm uses 
the properties of self-b alancing binary tre es. AVL 
( Adelson- Velskii and Landis. . . 1962 a, b: Knuth. 19971) trees 
are one possibility. Denoting dense matrices from the hy- 
bridization operators with capital letters and the expo- 
nential vectors p(Ti-|_i — r^) = e^"^^" = Pi.i+i with lower 
case letters, we can write the trace in Eq. (|93)) as 



Tr 



(202) 



and arrange all the operators in (|202p in a binary tree. 
It is easy to see that for every exponential p{t — >■ t^+i) = 
e^oiTi-Ti+i) between the first and last operator we can as- 
sign one of the branches of the tree. These "propagators" 
from time Ti to time r^+i, where a right branch contains 
the propagator from the node to the smallest time of the 
right subtree, and a left branch contains the propagation 
from the largest time of the left subtree to the node (Fig. 
[24|) . The main idea of the algorithm is that each node 
stores products of the matrix product of the left subtree 
times the propagator to the left, the operator, and the 
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propagator to the right times the matrix product of the 
right subtree. This requires an extra storage cost of 0(fc) 
in memory and additional functions for the re-balancing 
of binary trees, but reduces the computational effort of 
a sweep to 0{klogk). 



G. Use of symmetries, global updates 

Global updates are updates that affect many or all of 
the vertices of a configuration. Two simple examples are 
a spin-flip of all auxiliary spins in a CT-AUX simulation, 
and the exchange of all the segments of two orbitals in 
CT-HYB. 

If the update corresponds to an exact symmetry 
(global spin-flips in a paramagnetic system, segment in- 
terchange for degenerate orbitals, . . . ), the weight of a 
configuration remains unchanged and a proposal of the 
global update will always be accepted. In cases with 
exact symmetries the same effect may be achieved by 
enforcing the symmetry at the end of the simulation. 

Global updates are useful for the systems with weakly 
broken symmetries, in particular near a phase transition, 
where they may help to radically reduce autocorrelation 
times. An example are weakly spin-polarized states. In 
all instances considered in the literature so far, global 
updates required the recalculation of determinants to es- 
timate acceptance probabilities, at the cost of 0{k^) op- 
erations. Hence they should be performed at most once 
per (fc) update steps. The concept has proved to be useful 
to descri be an insulating s tate with a small polarization 



i n Refs. (IPotervaev et al. 



(IChan a/.l . l2009l: iKunes et al 




and similarly in 



H. Vertex functions 

For some applications, in particular the determina- 
tion of ph ase boundaries , respo nse functions, or "dual 



Ferm ion" (iRubtsov et a/.l. |2008h "DFA" (iToschi et al. 



2007f ) and other (|Kusunosel . l2006t ISlezak et alihm'^ ex- 
tensions beyond dynamical mean field theory, the expec- 
tation value of observables with four (or even six) cre- 
ation and annihilation operators are needed. An exam- 
ple is r^6crf(^l'T2,T3,r4) = {Trdl{Ti)db{T2)dl(Ti}dd(Ti)) . 

These correspond to reducible vertices. 

Time translation invariance implies that the four-point 
vertex is dependent on three time differences or three fre- 
quencies. Orbital or cluster symmetries of the impurity 
model may further reduce the number of independent in- 
dices abed. Nevertheless, for most these problems the 
number of observables that need to be measured - es- 
pecially for clusters or multi-orbital problems - is over- 
whelming: In a single orbital model, retaining 100 Mat- 
subara frequencies in each of the three momentum in- 
dices requires obtaining and storing results of 10® mea- 



surements. In a four-site cluster calculation, the same 
number would lead to 64 million observables. 

In the CT-AUX and CT-INT algorithms, the four 
point correlation functions are computed using the fact 
that for a fixed auxiliary spin configuration the problem 
is Gaussian and Wick's theorem can therefore be used 
together with Eq. ([55]) . Thus the problem reduces to 
the accumulation of the determinant of a 2 x 2 matrix 



(|Gull et all l2008al: iRubtsov et all 120051) 



{s.,n,.a^Z2) (^14 



with M, 



kl 



(203) 

defined in Eq. ([M)) . If only a few correla- 
tion functions arc measured, Eq. (|203p is best evaluated 
at run-time. If many or all correlation functions have to 
be measured at rir time points and the size Um of M 
is comparatively small, it is advantageous to accumulate 



only (A'// 



and {Ml;' 



r,.x,} j^{s,,T,,x,}s^ and recon- 



'ij / kl 

struct the correlation function at the end of the com- 
putation. While binning the latter expression is 0{n^) 
in memory, it is only 0{n\j) computationally (using the 
time translation symmetry). 

For larger problems, in particular cluster problems, 
Gah(wi,W2), the instantaneous single-particle Green's 
functions, are computed directly in frequency (and in 
DCA: momentum) space for a given spin configuration. 
The four-point functions are then obtained by comput- 
ing the Monte Carlo average of the instantaneous Green's 
function products Tabcd = {GabGcd - GadGcb) and using 
symmetries to reduce the number of observables. For 
many problems, only some of the four-point functions 
are needed (e.g. only the ones with energy transfer 0, 
or diagonal cluster momenta in DCA). Direct frequency 
measurement allows selective measurement of only these 
observables. 

In the CT-HYB, configurations with four local oper- 
ators are generated by removing two hybridization lines 
from configurations of the partition function, similar to 
how Green's function configurations are generated by re- 
moving one hybridization line. 

20091 : 



Some app l icatio ns (see, e.g., (jSlezak et ah . 
Toschi et ah . l2007t )) require the computation of 



ducible two-particle quantities. In order to obtain these 
vertices from the reducible ones, Bethe-Salpeter equa- 
tions have to be inverted. This process can be numeri- 
cally unstable, and how it is best done is currently still 
an open question. 

Our experience is that vertices measured directly in 
frequency space by the CT-AUX and CT-INT methods 
are most accurate, so that this is the method that should 
be used. 
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I. High frequency expansions of the self energy 

In many of the CT-QMC algorithms the self energy 
is obtained as the difference between the inverses of the 
fuU (G) and bare {G°) Green functions. Because both of 
these become small at high frequencies while their errors 
stay constant, the errors of the difference of the inverses 
become large and S(a;) is difficult to measure accurately 
for large lu. It is therefore useful to have an analytical 
representation of the self energy at high frequencies. In 
a general N orbital model the self energy is an iV x 
matrix S and its high frequency expansion is 



S(jw„) 



1 



1 



(204) 



The coefficients So,i may be obtained by from the coef- 
ficients in the high fre quency ex pansions of the full and 
bare Green functions ( Pottlioff et al.. . .199% using 



G{iujn) = —Go + 



Gi 



1 



rG, 



(205) 



the analogous equation for and for 5] = ^° — G^^. 

The coefficients in the high frequency expansions of G 
and 5" are in turn obtained from the discontinuities in 
the derivatives of G and 0" across r = as 



G„ = a(")G(T = 0+) a(")G(T = 0-). 



(206) 



The time derivatives themselves may be computed from 
the definition 

G"^(r-r') = -(T,dJr)4(r')) (207) 

by performing a small-time expansion of 

da{T) = e^^dac-"^ (208) 

and its conjugate. The structure of the second derivative 
term is simplified by exploiting time translation invari- 
ance to place the derivative on the first or second operator 
as appropriate. The result is 



Gf^ ({da, 4})= Sab, 



Gf^({[H,da],dl} 



Gf^({[H,da]. 



H,d 







(209) 
(210) 
(211) 



The coefficients for are obtained using the Hamilto- 
nian without the interaction term. 

We illustrate the procedure for the generic Hamilto- 
nian 



01026162 



jaia26i62^t .t .^^^^^ 



ab 



(212) 



kab 



(fermion antisymmetry implies that jo-ia-^bibi _ 
_ja2aibib2y Important for the self energy are the com- 
mutators with the interaction term, which are (bearing 
in mind the antisymmetry) 



I, da 

Idl 



= 2Eai6i62^""^'^''^<^^i^''- (213) 
= 2Eaia26i^°^°^'^'«rf&i- (214) 



Expanding and comparing terms we find that the con- 
stant term in the self energy is the familiar Hartrce term 



aifcl 



while 



(215) 



(216) 



The expectation values in Eq. (|215p and (|216p must in 
general be measured. 

For the single-orbital Anderson impurity model we find 
(with a chemical potentia l shift of U /2 usually employed) 
(jBhimeil . l2002t IComanad . l2007t iKnechtl . |2003|) 



E(c.) = U {n^a) - 77 + 



— (7i_,)(l- (n_,))+0 

(217) 



Expressions for multi-orbital models wit h density den- 
sity interactions are deri ved in (IGulll. 120081) . for plaq uette 
CDMFT in (lHaulel . l2007tlHaule and Kotliaill2007bli . and 
for multi-orbital models with t he Slater-Kanamori form 
of interactions in (|Wand . l2oToh . 



XI. APPLICATIONS I: DMFT 

A. Overview 

Dynamical mean field theory (DMFT) provided an im- 
portant initial motivation for the development of CT- 
QMC impurity solvers and is perhaps the domain to 
which the new solvers have made the most important 
contributions. We therefore consider dynamical mean 
field applications in some detail. We do not review 
the dynamical mean formalism in detail here, instead 
referri ng the reader t o rev iews of the original "single- 



site" (IGeorges et all Il996f ) and subsequent "cluster" 



( Maier et all l2005a ) formulations (see also ( Potthofj . 



20061 )) and to reviews of the combination of the for- 
malism with modern electronic structure theory which 
provides an important step towards an a b-init i o de- 
scription of s t rongly correlated compounds ()Heldl . l2007t 
Kotliar et all 120061 ). However, for clarity we provide a 
brief explanation of the essential ideas. 

A common strategy in theoretical physics is to obtain 
an approximation to the solution of a problem in terms of 
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a solution of a more tractable auxiliary problem, which 
is specified by a self-consistency condition. Weiss mean- 
field theory and density functional band theory are ex- 
amples. Dynamical mean field theory provides an ap- 
proximate solution of a lattice fermion problem in terms 
of an auxiliary quantum impurity model with interac- 
tion terms specified by the interactions in the original 
lattice model and single particle energies and hybridiza- 
tion functions determined by a self-consistency condition. 
One may think of it as based on an approximation of the 
full self energy E°''(fc,w), which depends on a discrete 
set of orbital labels a, b and continuous momentum and 
frequency variables A;, w, in terms of N functions of fre- 
quency I]j=i.. jv(w) which are the self energies of an A^- 
orbital impurity model. Different 'flavors' of dynamical 
mean field theory correspond to different prescriptions for 
reconstructing the lattice self energy from the Sj(cj) and 
to different forms of the self-consistency condition. All 
formulations require a solution of the quantum impurity 
model which is of high and reasonably uniform accuracy 
over a wide frequency range. It is not unfair to say that it 
is the development of CT-QMC techniques that has given 
DMFT the computational power needed to address the 
full range of problems arising in the physics of correlated 
electron physics. 

The first applications of dynamical mean field theory 
were "single-site DMFT" approximations to the physics 
of model systems such as the one orbital Hubbard model 
and the one orbital Anderson models. (jGeorges et al. 



19961 ) For these two cases the auxiliary impurity model 
is the single- impurity Anderson model (Eq. ([12])) which 
can be solved to sufficient accuracy for most purposes 
by pre-CT-QMC techniques, in particular the Hirsch-Fye 
approach. While the greatly improved efficiency of CT- 
QMC methods has enabled a more refined study of some 
aspects of the physics and has shed light on some spe- 
cial cases, the single-site, single-orbital case has mainly 
served as a test-bed for investigating and evaluating CT- 
QMC methods. 

A second class of DM F T appl ications is the "cluster" 



extensions ( Maier et all l2005a ). which can treat the 



short ranged correlations characteristic of high tem- 
perature superconductors and other low dimensional 
systems. In the single-site DMFT method the self 
energy is replaced by its average over the Brillouin zone. 
Cluster DMFT methods allow for some coarse-grained 
momentum dependence and include some aspects of 
intersite correlations. They arc thus of interest in the 
context of understanding the strong momentum space 
differentiation observed in high-Tc cuprates and other 
low dimensional systems. As of this writing, most of the 
"cluster-DMFT" literature has focused on models with 
a "Hubbard" interaction. For models with Hubbard 
interactions conside rable progres s has been made by the 



2006: Maier et al 



use of Hirsch-Fye (Jarrell et all 1200 It iMacridin et al. 



2005ai bl: 



Vidhvadhiraia et al. 



2009) and exa ct- diagona liz ation(Kancharla et al. 



20081: iKvung et all l2006bt iLiebsch et all I200S : 



Liebsch and Ton bI. I2009I) methods (but see (jKoch et al. 



2008h ). However, the more efficient CT-QMC methods 
have permitted the examination of much wider regions 
of phase space, which has led to new results and insights. 

A third class of DMFT applications is to the study of 
materials such as transitional me tal oxides and actinides 
with partially filled d or / shells.(lHeldl.l2007tlHeld et all 
20061 : iKotliar et all . l2006t iKotliar and Vollhardtl . l2004h 
In these materials multiplet interactions such as Eq. 
are crucial to many aspects of the physics. CT-QMC 
methods have provided the first reliable solvers for this 
class of models and have yielded new insight into their 
physics. 

A fourth class of DMFT applications are extensions 
such as the "dual fermion" and "dynamical vertex" 



approximations (iKusunosd. 12006 



Slezak et al.. 2009: Toschi et al 



.Rubtsov et all l2008t 
2007|). Thes"rTnethods 



require the accurate calculation of the full four-point ver- 
tices of impurity models, and this computationally chal- 
lenging task seems feasible only with CT-QMC methods. 

In the rest of this section we summarize the applica- 
tions in the order presented above, and close with re- 
marks about future challenges. 



B. Single-site DMFT approximation to the single orbital 
Hubbard model 

An important early success of single-site dynamical 
mean field theory was an improved understanding of 
the "Mott" or correlation-driven metal insulator tran- 
sition. This is one of the fundar nental questions in 
electronic co ndensed matter physics ( Imada et aZ.l . ll998t 
otti, Il949l) . The essential physics is captured by the 
one-band Hubbard model, specified by a hopping tij be- 
tween sites i and j and an on-site interaction U : 



(218) 



It has been known for many years (jlmada et all 119981 ) 
that at a carrier concentration n = 1 per site the model 
exhibits a paramagnetic metal to paramagnetic ('Mott') 
insulator transition as the interaction strength U is in- 
creased above a critical value of the order of the band- 
width. The state obtained by doping the large U Mott 
insulating state has many unusual properties. 

A single-site dynamical mean fie ld theory of the Hub- 



bard model was formul ated in Ref. ( Georges and Kotliail 
19921). As shown bv iMiiller-Hartmannl (|l989t ) and bv 
Metzner and Vollhardtl (|l989l ). it 
limit of spatial dimensionality d —, 
be re asonably reliable in o? = 3. (jKotliar and Vollhardt 
200J) corrections are significant in d = 2 and d = 1. 
The corresponding quantum impurity model is Eq. (|12l) . 



becomes exact in a 
oo and is believed to 
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FIG. 25 (Color online) Metal-insulator phase diagram of 
paramagnetic two dimensional Hubbard model in the single 
site DMFT approximation, plotted against normalized inter- 
action strength Ur = with Umit = 9.35i for this 
model. The transition is first order, with coexistence region 
indicated by shading (yellow online). The dashed line indi- 
cates the bad metal/bad insulator crossover determined from 
the condition that the imaginary part of the self-energy at few 
lowest Matsub ara frequen c ies is fl at at the crossover value of 
U. From Ref. (|Park et aU 1200881 ). 



Studies prior to the advent of CT-QMC established that 
in the single-site dynamical mean field approximation the 
phase diagram at half filling involves a first-order transi- 
tion with a critical end-point in the T — U plane and a 
higher temperature crossover regime, as shown in Fig. 1251 
Physics beyond the single-site approximation will correct 
the phase diagram. In two spatial dimensions the change 
is qualitative, but in higher dimension the changes are 
less severe and the single-site phase diagram remains rel- 
evant. The scales are low, presenting a challenge to com- 
putational methods. 

The metal-insulator transition may be characterized by 
the 'kinetic energy', essentially (^ijii-jcl^Cja) , which 
gives a measure of the degree to which electro n motion 
is blocked by the interaction U (jMillid . |2004| ). At low 
T the transition from insulator to metal is marked by 
the appearance of a very narrow band of quasiparticle 
states inside the gap, which itself remains well formed 
for a range of U below the transition. These states form 
a Fermi liquid, but wi th very low Ferm i temperature - 



Theo retical arguments (jFisher et al\ . 119951 : iKotliar et al. 



20021 ) established that the doping driven transition is 
also first order at low T, marked by the sudden ap- 
pearance of states inside the Mott gap. However, the 
transition in this case is only weakly first order and for 
many years proved difficult to observe. These and other 
somewhat unusual features of the phase diagram occur 
because in the single-site approximation the paramag- 
netic ins ulating state has an extensive entropy of In 2 
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FIG. 26 Top panel: Kinetic energy obtained using the indi- 
cated impurity solvers plotted as a function of temperature 
for the Hubbard model with a semicircular density of states 
and bandwidth At and interactions indicated. For U = At 
the model is in a strongly renormalized metallic phase, for 
U — 4.95f a low-T metal to higher T insulator transition oc- 
curs, visible as a jump in kinetic energy at (T/t)^ ~ 0.0007. 
From Ref. (jWerner et all |2006| ). Bottom panel: Doping 
per spin, 0.5 — n, as a function of chemical potential for 
pt = 400 and indicated values of U /t. At this tempera- 
ture, the transition at half-filling (/i = = (7/2) occurs 
at Uc{T) « 5.65. For U > Uc the n = 1 state is insulat- 
ing and shifting the chemical potential induces a first order 
metal-insul ator transition visible as a discontinuity in ?i(/x). 
From Ref. (| Werner and Millisl . [2007al ) . 



enough that the singl e-site phase diagram remain s ex- 
perimentally relevant. (IKotliar and Vollhardti . 120041 ) 

The top panel of Fig. [51] shows results from the first 
CT-HYB study o f the interaction driv en metal-insulator 

20061) . It compares the 



phase transition (jWerner et al 



per site ( Georges et al . 19961 ). In the physical situation 
the entropy will be quenched below some scale, but in 
real three dimensional materials the scales may be low (jWerner and Millisl . l2007al) shows the dependence of car- 



kinetic energy calculated in the single-site dynamical 
mean field theory for the one band Hubbard model 
via a Hirsch-Fye simulation, an exact-diagonalization 
method, and the CT-HYB method. One see that the 
CT-HYB method agrees with the other methods (where 
there is overlap), allows access to very low tempera- 
tures, clearly reveals the behavior associated with a 
strongly renormalized Fermi liquid and captures the first- 
order Mott transit i on. T he bottom panel, taken from 
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rier concentration on chemical potential for interaction 
strengths above and below the Mott transition providing 
the first clear verification that the doping-driven Mott 
transition is first order. Results such as these established 
that the CT-HYB method provides a successful descrip- 
tion even of subtle, low temperature properties of impu- 
rity models. 

Another long-standing question in correlated ele ctronic 
theory was Nagaoka's prediction (Nagaoka, 1966f) of fer- 
romagnetism in the Hubbard model at carrier concen- 
trations very near to half filling and very strong interac- 
tions. The status of this result was unclear for many years 
because Nagaoka's original arguments applied rigorously 
only to one hole in a Mott insulator, not to a thermo- 
dynamic density of holes. Park, Haule, Marianetti and 
Kotliar used the CT-HYB method to establish the ex- 



istence of a thermodynamic Nagaoka phase (jPark et al 
2008b|), at least in the d = oo limit. 



While quantum Monte Carlo methods are most effec- 
tive for imaginary time (thermodynamic) simulations, 
it is of course very important to attempt to obtain 
spectra which can be compared to experimental re- 
sponse functions. The standard method is maximum- 
entro py analytical continuati o n of the imaginary time 



1996[ ). One question of 



data (jJarrell and Gubernatisl . 
particular importance has been the value of the insu- 
lating gap in the strong correlation limit at half filling. 
Here a weakness of the CT-QMC methods reveals itself: 
because the Green function is numerically very small in 
the middle of the imaginary time window, the simulation 
does not visit this region much and the statistics are rela- 
tively poor. But it is precisely this region which is impor- 
tant for the value of the insulating gap. Straightforward 
analytical continuation of the Green function leads to 
broadened gap edges. Ref. ^^eTdi^QM) discusses 
the issue in detail, arguing that one should instead con- 
tinue the self energy and construct the green function 
from the continued self energy. 



C. Cluster dynamical mean field theory of the single orbital 
Hubbard model. 



The single-site dynamical mean field theory ne- 
glects spatial correlations and while it becomes exact 
i n an appropriately de fi ned i nfinite dimensional limit 
(jMetzner and VoUhard^ . Il989t ) it is known to provide 
an insufficient description of the metal insulator tran- 
sition in finite dimensional models. Deviations from the 
single-site dynamical mean field picture are particularly 
large in the case of two spatial dimensions relevant for 
high temperature superconductivity. A striking and still 
ill-understood feature of hole-doped high temperature 
cuprate superconducting materials is the 'pseudogap', 
a suppression of the electronic spectral function occur- 
ring for momentum states along the Brillouin zone face 
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FIG. 27 Comparison of momentum-dependence of the self- 
energy of the two-dimensional Hubbard model with param- 
eters U/t = 4, ^/t = 3.1 and T/t = 0.4 calculated at the 
Matsubara frequency cjo = C = ''^1 P calculated for along the 
cut (0, 0) — (tt, 0) — (tt, tt) — (0, 0) in the first Brillouin zone using 
a numerically exact "diagrammatic Monte Carlo" procedure 
and using CT-AUX simulations of single site and 4, 8, 16, and 
32-sit e DCA DMFT approximations. From Ref. (|Kozik et all 
120101 ). 



but not for states along the zone diagonal. (In electron 
doped cupratcs a phcnomcnologically somewhat differ- 
ent effect, confusingly also sometimes termed "pseudo- 
gap" is now understood as arising from proximity to 
a state with long-ranged two sub l attice antifer r omag- 
netic order (Armitage al\. 2010 : Kvung et al\ . 12004 : 



Motovama et all 120071 : IZimmers et all |2005| )). The 
pseudogap is a dramatic example of the more general 
phenomenon of 'momentum space differentiation': an in- 
crease in the variation of physical quantities around the 
Fermi surface as the insulating phase is approached. Its 
origin and consequences remain hotly debated topics. 

Attention in recent years has focused on clu ster dy- 
namical mean field theories (jMaier et all l2005al ) , which 
capture at least some aspects of spatial correlations. 
These methods have produced a range of very exciting 
results with strong qua litative similarities to the cuprates 
(jTremblav et a/.l . l2006[) )ut are computationally very de- 
manding. To date, cluster dynamical mean field approxi- 
mations have mainly been used to stud y the single-band , 
two dimensional Hubbard and t — J (jZhang and Ricd . 
1988t ) models, although some work on Hubbard- 
l ike models related to heavy fermions has appeared 
(|Sun and Kotliaii 120051 ). Significant results were ob- 



tained with approximate analy tic al and s emi-analytica l 



methods ( Chakrabortv et all l2008t iKvung et 



2008 
2009) 



2006bt iParcollet et all I2004D. exact diagonalization 



(ICivelh et all 120051: iKancharla et oil . l2008t feoch et al . 



Liebsch et all 
and Hirsch - Fye 



20081 iLiebsch and T~ 



QMC (IHuscroft et al 



Jarrell et all 2001 : iLichtenstein and Katsnelson 



2001 



200C; 



Macridin ad . l2006HMaier et ad I2005all2007ari2005b . 
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20061 l2007bl: IVidhvadhiraia et all |2009[) approaches. 
While ED resuhs have been reported only for clus- 
ters up to 4 sites, Hirsch-Fye approaches have been 
extended up to clusters of size 64 ( a t wea k interac- 
tion strength) (IMoukouri and Jarrelli 120011) and 16 
(|Vidhvadhiraia et all |2009[) [at moderate to strong 



plaquette 



interaction strength) although the magnitude of the 
computations required meant that studies were restricted 
to select dopings. 

The advent of CT-QMC methods greatly increased the 
ranges of parameters that could be studied with rea- 
sonable computational resources. Scans of parameter 
space became feasible and phase diagrams have been es- 
tablished. Hybridizat ion expansion metho ds have been 
used to study 2-sit e ( Ferrero et al\, 2009bl la) and 4-site 
(|Gu11 et all l2008bl : IPark et all l2008al ) clusters. In the 
case of these small clusters, the analysis of cluster eigen- 
state occupation probabilities has provided new insights. 
For larger clusters the dimension of the local Hilbert 
space is so large that the hybridization expansion method 
has not been successfully applied. C T-AUX methods 
have been used to stu dy 8 site clusters ( Gull et ai . 2009t 
Werner et all l2009al) and. a.t U = 4i, a range of cluster 



sizes up to 32 ( Kozik et al . 2010l ) 



We present here a few representative CT-QMC cluster 
DMFT results which illustrate the power of the meth- 
ods and the nature of the new results which have been 
obtained. The convergence of cluster schemes with clus- 
ter size is shown in Fig. [27l which compares CT-AUX 
cluster DMFT results are to a direct Monte Carlo eval- 
uation ("diag-MC" ) of diagrams of the lattice problem 



(|Kozik et aZ.I . l2010l ). While "diagMC" as of now only 



works for relatively weak interactions, the results do not 
contain a fc-space discretization of the self energy, so that 
the results are exact within error bars. As seen in Fig. 1271 
for U ~ A, convergence of the cluster DMFT results to 
the exact ones is achieved with 32 sites. 

We now turn to results relating to stronger cou- 
pling physics, beginning with results obtained for four 
site clusters, which have been studied usirig CD MFT 
(|Kothar et aLl . l200l[ ) and DC A (|Hettler et aZ.l . ll998l ) ver- 
sions of cluster dynamical mean field theory. The four- 
site cluster calculations may be thought of as approxi- 
mating the full momentum dependence of the self energy 
by its value at the four points S — (0,0), Py = (0, tt), 
= (7r,0) and D = {tt,tt). 

The physics brought by the added momentum depen- 
dence changes the character of the Mott transition in 
dimension d — 2. Fig. [28] shows the phase diagram 
of the two dimensional Hubbard model obtained in a 
detailed CT-HYB study of the 4-site CDMFT approx- 



imation ([Park et all l2008a|) . It should be compared to 
Fig. [25] which presents single-site DMFT results for the 
same model. The interaction-driven transition was found 
to be first order, as in the single-site case. However, not 
only is the critical interaction strength much less than 




FIG. 28 (Color online) Metal-insulator phase diagram of the 
paramagnetic phase of the two-dimensional Hubbard model in 
the plane of temperature T /t and interaction U /t measured 
relative to the critical end-point value [/mit = 6.05t in the 
4 site CDMFT cluster approximation. Band parameters are 
identical to those used in Fig. [25] Inset: pie-chart histogram 
of occupancy probability of the tw o insulating states at low 
and high temperatures. From Ref. (jPark et al. 

|,[2008i)- 



in the single-site approximation, but the phase boundary 
bends in the opposite direction from that found in the sin- 
gle site calculation, indicating that in the multi-site ap- 
proximation the insulating phase has lower entropy than 
the metallic phase. The narrow band of in-gap states 
whose appeara nce character i zes the Mott transi t ion in 



high dimension ([Fisher et all [1995[ : [Kotliar et all [2002[ ) 



is not found in cluster calculations for 2D systems. 

Insight into the metal-insulator transition is enhanced 
by the ab ility of CT-H YB to provide sector occupation 
statistics ([Hauld . [20071 ). These are indicated in Fig.[28]bv 
pie-chart insets. The low temperature insulating phase 
was found to be characterized by a strongly dominant 
occupation of one state, corresponding to a singlet con- 
figuration of the four electrons on t he plaquett e. This 



correlation was argued by Gull et al. ([Gull et all [2008bl ) 



to indicate that in the cluster dynamical mean field meth- 
ods the metal-insulator transition was driven by the ap- 
pearance of strong short ranged order (most likely related 
to a columnar dimer phase). By contrast, the high tem- 
perature "bad insulator" state, which has entropy of the 
order of ln(2), populates many states of the plaquette 
with significant probability. 

Further evidence of the importance of short ranged 
order was obtained from the electron s pectral func- 
tions ( Gull et "aII. [2008btlPaA" .eLaLl,[2008a|) computed by 
maximum entropy analytical continuation and shown in 
Fig. [29] The insulating state has a gap. The dotted line 
gives the spectral function calculated in a mean field ap- 
proximation based on a two sublattice order; the strong 
similarity indicates that short ranged order is responsible 
for the insulating behavior. 

The left panel of Fig. [50] presents the changes in the 
density of states in the P = (0, tt), (tt, 0)-sector as elec- 



44 




Energy [t] 



FIG. 29 Solid line: on-site spectral function computed for 
different momentum sectors by maximum entropy analytical 
continuation of QMC data for U — 6t and doping x — 0. 
Dashed line: spectral function in the P — (0, tt), (vr, 0)- 
momentum sector. Dotted and dash-dotted lines: P — 
(0, tt), (tt, 0) and local spectral functions obtained by perform- 
ing the DCA momentum averages of the standard SDW mean 
field expres sions for the Gree n function, with gap A = 1.3t. 
From Ref. (|Gull et all\2008H ). 
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FIG. 30 Left panel: Doping dependence of P = (0,7r), (tt, 0)- 
sector density of states obtained by analytical continuation 
of quantum Monte Carlo data obtained from DCA approx- 
imation a.t U — 5.2t, temperature T — t/60 and dopings 
X = 0.04 (solid), X = 0.08 (dashed), and x = 0.15 (dash- 
dotted line). Dotted line denotes the noninteracting density 
of states. Right panel: Evolution of the occupation proba- 
bilities wit h doping at U — 5 .2t and temperature T = t/30. 
From Ref. (|Gull am2008bl '). 



trons are added. The curves are obtained by analytical 
continuation of quantum Monte Carlo data. The 'Mott' 
gap visible in Fig.. [21] has filled in even at the lowest dop- 
ing shown, but for the lower dopings a small 'pscudogap' 
(suppression of density of states) appears near the Fermi 
level while for x = 0.15 the value of the spectral function 
at the Fermi level approaches that of the noninteracting 
model, indicating the restoration of Fermi liquid behav- 
ior, consistent with experiment and with many previous 
theoretical results. 

Examination of the sector statistics shown in the right 
panel of Fig. [30] indicated that the transition from pseu- 
dogapped to Fermi liquid behavior occurred at the dop- 
ing at which the plaqucttc singlet state ceased to dom- 
inate the physics. An intriguing and still open question 
concerns the degree to which the level crossing in sector 
statistics is related to the 'avoided criticality' discussed 



by Haule and Kotliar (jHaule and Kotliarl l2Q07al) . 

Very recently CT-AUX methods have been used to ex- 
amine the larger 8 site cluster shown in Fig. [HU The 
greater efficiency of the CT-AUX method permitted a 
comprehensive examination of the behavior as a func- 
tion of interaction strength, carrier co ncentration, sec- 
ond neighbor hopping and temperature (jCull et all 120091: 



Werner et all . l2009a^ . A striking new result is that both 



the interaction-dependent and doping-dependent metal 
insulator transitions are multi-staged, with different re- 
gions of the Fermi surface are successively gapped as 
carrier concentration or interaction strength are var- 
ied. (Similar behavior was also found in a 2 site clus- 
ter with a clever cho ice of momentum-space patching 
(jFerrero et all l2009al) ). The phase diagram for the 
interaction-driven transition is shown in the right-hand 
panel of Fig. [2U 

Identification of a gapped region in a spectrum can 
be based on analytical continuation. However, obtaining 
data of the requisite quality for analytical continuation 
is very expensive, and analytical continuation is in any 
event a notoriously ill-posed problem. Methods for iden- 
tifying metal-insulator phase boundaries directly from 
imaginary time are therefore valuable. At present it ap- 
pears that the most reliable method is to plot /JG/f (/3/2) 
in momentum sector K, related to the density of states at 
the Fermi energy hy /3GK{l3/2) = J ^ ,J[Z^/"lr)] ■ This 
is shown as a function of interaction strength or chemi- 
cal potential for several temperatures as shown in Fig. 15^ 
The sector gapping transitions were identified from the 
temperature dependence of /3Gk{t — /^/2). One sees 
from Fig. [32] that a gap opens in sector C at lower fi 
than in sector B. Remarkably, this sector-selectivity oc- 
curs on the hole doped but not on the electron-doped side 
of the phase diagram. The successive gapping bears an 
intriguing similarity to the behavior of high- Tc cuprates 
in the pseudogap regime. The interpretation and impli- 
cations of the CT-QMC results are at present the subject 
of active investigation. 



D. Dual-fermion calculations for the single-orbital Hubbard 
model 

Cluster dynamical mean field methods suffer from sev- 
eral drawbacks. A cluster of a given size corresponds to 
a coarse-graining either in real or reciprocal space, which 
may bias the physics. At model parameters relevant for 
high- Tc cuprates, the sign problem limits the range of 
cluster sizes that can be studied, even with CT-QMC 
methods, so that systematics of scaling with cluster size 
has been established only for weak inte ractions (Fig. \27\ 
(|Kozik a/.l . l20ld iMaier et ad l2005al|bl )). 

Alternative w ays to handle non l ocal correlation s have 



been proposed (iKiisunose . 
Slezak etali l2009l: IToscM al 



2006t 
t all \i 



■Rubtsov et all l2008t 
2007[) . These methods 
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FIG. 31 Left panel: Brillouin zone partitioning associated 
with the 8-site cluster DCA approximation with definition 
of the four inequivalent momentum sectors A, B, C and 
D. The noninteracting Fermi surface for t' = — 0.15t and 
density n = 1 is indicated by the gray line. Right panel; 
sketch of the paramagnetic state DCA phase diagram of the 
Hubbard model, calculated for the cluster shown in the left 
panel at half filling, as a function of interaction strength U 
and next-nearest neighbor hopping t' . A Fermi liquid metal 
phase (left, red on-line), a sector selective intermediate phase 
(middle, green on-line) in which the sectors labeled as C are 
gapped but those labeled as B remain gapless, and a fully 
gapp ed insulating phas e (right, blue on-line) are shown. From 
Ref. (|Gull et a;.U2009l ). 




FIG. 33 Spectral function A^^^o.fc at the Fermi level calcu- 
lated for the Hubbard model at i = 0.25, t' = -0.075, U = 
4.0, /3 = 80 and doping x — 0.14 using the lowest order 
momentum-dependent diagram in the dual fermion method, 
with analytical continuation performed by polynomial extrap- 
olation from Matsubara frequencies. An anisotropic destruc- 
tion of the Fe rmi surface in the ps eudogap regime is clearly 
visible. From (|Rubtsov et o/.l . 120091 '). 
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FIG. 32 I3G{^) calculated in DCA approximation to 8-site 
cluster for sectors B (full symbols) and C (empty symbols), 
at U/t = 7 and t' /t = —0.15. The strong temperature depen- 
dence in the sector C curves arises from the Van Hove diver- 
gence in the density of states. The crossing poi nts indicate 
the o nset of gapping in the sectors. From Ref. ijGull et all 
l2009l ). 



are systematic expansions around the single-site DMFT 
approximation, and have the advantage that both short- 
and long-range fluctuations are treated simultaneously, 
but require evaluation of vertex functions. Wc discuss 
here the dual fermion approach where CT-QMC methods 
have been extensively applied; the computational issues 
for the other methods are sim ilar. The dual fermion ap- 



proach (IRubtsov et all 120081 ) is formulated as a standard 



diagrammatic technique in terms of auxiliary, so-called 
dual variables, introduced via a continuous Hubbard- 
Stratonovich transformation. The corrections to single- 



site DMFT appear as diagrams containing the reducible 
vertex parts of single-site DMFT impurity problems at 
nodes, whereas lines are propagators for dual Green's 
functions corresponding to non-local parts of the DMFT 
lattice Green's function. 

Technically, the method requires an impurity solver 
that can provide not only single-electron Green's func- 
tions of the (single-site) impurity problem, but also the 
full four point vertices (also of the single-site impurity 
problem) as a function of all frequencies. The CT-QMC 
algorithms allow such calculations in both the interac- 
tion and hybridization expansion formalisms and have 
been employed for dual-ferniion analyses of the Hubbard 
model. 

In Ref. ( Rubtsov et all |2009^ the pseudogap regime of 
the doped t - t' Hubbard model was studied. A CT-INT 
solver was used to obtain both the Green's function G 
and the 4-point vertex F^^^ in the Matsubara-frequcncy 
domain. The spectral function Ak = —\/'KluiGu,=o.k for 
the entire Brillouin zone is shown in Fig. [33] for 14% dop- 
ing. The phenomenon of momentum space differentia- 
tion is clearly seen: the Fermi surface in the antinodal 
direction is relatively diffuse, whereas sharp quasiparti- 
cles appear near the nodal points. 

20091) to sum 



CT-INT was used in (Hafermann et al. 



the particle-hole ladders in dual diagrams for the half- 
filled Hubbard model, revealing a pseudogap formed by 
antiferromagnetic correlations even in the absence of a 
explicit symmetry breaking. Further investigation of this 
and related approximations is an active area of research. 
Some of the results are presented in Fig. [M] 
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FIG. 34 Momentum-integrated spectral function (DOS) of 
half-filled single-band Hubbard model calculated using the 
dual-fermion method. Left: Metallic (thin lines, red on-line) 
and insulating (heavy line, black on line) spectral function 
within the coexistence region of the Mott transition U/t = 
6.25, T/t = 0.08. Right: DMFT (dashed), lowest-order dual 
fermion correction (thin line) and ladder dual fermion (thick 
line) DOS at U/t = 4, T/t = 0.19. The ladder dual fermion 
result exhibits a pseduog ap of antiferromagnetic origin. From 
(jHafermann et aZ.I. I2OO9I ) . 



XII. APPLICATIONS II: DMFT FOR MULTI-ORBITAL 
AND KONDO MODELS 

Most "correlated electron" materials involve transition 
metal, rare-earth or actinide states with multiply degen- 
erate levels. The electrons in these levels are subject to 
complicated interactions such as the "Slater-Kanamori" 
couplings shown in Eq. (jl3p and exhibit a richer vari- 
ety of physical effects than found in the single-orbital 
Hubbard model, including high-spin to low-spin transi- 
tions, orbital ordering and orbitally selective Mott tran- 
sitions. Until recently investigation of these models was 
hampered by a lack of good numerical methods: there 
were no good auxiliary field transformations, so Hirsch- 
Fye methods could not be used unless the rotational sym- 
metry of the interaction was broken so only the Jz (den- 
sity) component of the spin exchange was retained and 
the "pair hopping" terms in the Slater-Kanamori Hamil- 
tonian were neglected. There were too many states for 
exact diagonalization or NRG methods. The situation 
has now changed. Studies of realistic models of mate- 
rials involving electrons in two or three-fold degenerate 
orbitals are straightforward, 5 orbital problems (i.e. the 
full d multiplet, needed e.g. for the pnictides) are man- 
ageable, and problems involving one electron or hole in 
the 7-fold degenerate / shell arc becoming possible. How- 
ever, a complete single-site DMFT computation for mate- 
rials such as Pu in which the / shell is multiply occupied 
and rotationally invariant (exchange and pair-hopping) 
interactions are important cannot be done with present 
methods: basis truncations (Sec. lX.R2|) or other approx- 
imations (for example the Krylov techniques discussed in 
section IV. Pp are required. The calculations are generally 
done with the hybridization solver because the interac- 
tions are strong and multiple interactions are important 
and so far have been restricted to the single-site dynam- 
ical mean field approximation because the proliferation 
of orbitals means that multisitc models involve too many 



20 



15 



10 



1 phases: 
P & 01 llllllllllll 
F&Ol 
P &02 lllllllllll 
A lllllllll 



03b= 




0.5 



1.5 



2.5 



FIG. 35 (Color online) Main panel: phase diagram of the 3 
band model with semicircular density of states at fit = 50 
and J — U/6 in the plane of particle density n and interac- 
tion strength U. The vertical lines indicate the Mott insu- 
lating phases at integral values of n. The magnetic state is 
labeled by P (paramagnetic), F (ferromagnetic) and A (two 
sublattice antiferromagnetic) while the labels 0(N) denote 
the 3 classes of orbital ordering discussed in (jChan et all 
|2009| ) . The heavy dashed line (orange on-line) gives the 
boundary o f the non-Fermi-liqu id frozen-moment phase dis- 
covered in (| Werner et all [2OO8I ) . Inset: Hartree-Fock phase 
diagram for magnetic phases of the same model. Magnetic 
phase boundaries are indicated by solid lines and orbital or- 
dering boundaries by dashed lines. OO and OS stand for the 
orbitally-ordered and orbitally-symmetric phases respectively. 
All transitions are second order except the FM-AFM transi- 
tion and the o rbital ordering tran sitions at U > 12t and small 
n. From Ref. (jChan et a/.l . l2009t ). 



states to be practical at present. 

Figure 1551 shows the phase diagram (IChan et al. 



20091) 



calculated for a model of electrons moving among three 
degenerate orbitals with the full rotationally invariant 
interactions. The effect of the Hund's coupling on the 
multiorbital Mott transition was determined and a rich 
multiplicity of phases has been found. The orbital degree 
of freedom is important to stabilize the metallic phase 
at relevant interaction strengths (the two orbital model 
with two electrons and J/U = 1/6 is insulating for U > 
3.7t (I Werner and Millisl . [20073 )). Suppressing the L ~ 1 
orbital angular momentum states by applying a crystal 
field rapidly leads to an insulator. 

A remarkable feature of the phase diagram is the line 
indicating an apparent quant um "spin free z ing" transi- 
tion with unusual properties ( Werner et all . |2008| ) . The 
phase exists only in the window < J < J7/3. For 
J = the frozen moment phase does not exist while for 
J > U/3 the term U' - J = U - 3J in Eq. ^ changes 
sign and the physics of the model becomes different. The 
spin freezing transition was originally identified from an 
unusual behavior of the self energy and its nature was 
confirmed by an examination of the local spin and or- 
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FIG. 36 Imaginary time dependence of the spin-spin correla- 
tion function {Sz{0)Sz{t)) (positive correlation function, full 
symbols) and orbital correlation function (ni (0)n2(T)) (nega- 
tive correlation function, open symbols) calculated for a three- 
band model with U = 8t and J = U/6 using CT-HYB at car- 
rier concentrations n indicated. The convergence of the spin 
correlations to a value different from while the orbital corre- 
lation converge to zero in dicates spin but not orbital freezing 
in the model. From Ref. (| Werner et all\200§i ). 



bital correlation functions. 

Figure [35] presents results for the imaginary-time 
impurity-model spin-spin and orbital-orbital correlators 
Cooir) = (O(t)O(O)) with O representing either the 
electron spin density = lJ2a 5(^i,t^"'1' ~ ^L;^".^) 
or the orbital density fia = X^o- '^a o-^a.o-- ^ Fermi liq- 
uid at low temperature T, Css(t) ~ (T/ sin(7rTT))^ for 
imaginary times r sufficiently far from either r = or 
T = 1/T. The DMFT results are consistent with this 
form in the Fermi liquid phase, but in the non-Fermi- 
liquid phase the spin-spin correlator C'ss is seen to ap- 
proach a constant at long times indicating the presence 
of frozen moments whereas the orbital correlator is seen 
to decay rapidly with time on both sides of the phase 
transition. 

The hybrid ization expansion solver yields information 
(jHauld . 120071 ) on which of the different eigenstates of 
Hioc are represented in the partition function. At J > 
at couplings {U > At) only a few states are relevant. 
The large-?/ density-driven transition is marked by a 
change in the dominant states from the one-electron 
states S'=:l/2,L = ltoa nine- fold degenerate man- 
ifold of two electron states with S = 1 and L = 1, with 
the two manifolds becoming degenerate at the transition. 
The interaction-driven transition is on the other hand 
marked by a change in the weight of the two subleading 
states S" = 1/2, L = 1 and 5 = 3/2, L = 0, imply- 
ing a change in the magnitudes of coupling strengths. 
The ability to combine measurements of response func- 
tions with an analysis of which states contribute appre- 
ciably to the partition function is a great advantage of 



the CT-HYB method. This ability has been used in a re- 
cent paper to gain important new insights into the "hid- 
den order" phase of the h eavy fermion material URu2Si2 



(jHaule and Kotliail [ioooh 



A. Heavy Fermion compounds and the Kondo Lattice 
IVIodel 



"Heavy fermion" compounds pose one of the great 
conceptual cha llenges of correlated electron physics 



(Stewart, 19841 ). These materials are intermetallic com- 



pounds in which one element is a rare earth (such as Ce 
or Yb) or actinide (such as U or Pu) with a partially 
filled /-shell, while the other elements contribute s,p,d, 
electrons to broad, weakly correlated bands. The / elec- 
trons are weakly hybridized to the other bands and are 
subject to strong interactions, so that typically one / 
valence state is strongly dominant. At temperatures of 
the order of room temperature, the materials appear as 
two-component systems, with magnetic moments (arising 
from the / shells) embedded in and weakly coupled to a 
Fermi sea of s,p,d electrons. At low temperatures, how- 
ever, the spins and conduction electrons combine into a 
new object, which may become a heavy mass Fermi liq- 
uid or a narrow gap Kondo insulator, or may become 
unstable to unconventional superconductivity, magnetic 
ord er, or may exhibit a variety of quantum c ritical behav- 
ior ( Lohnevsen et all 12007 : Stewart . 2001 ). Our under- 
standing of the heavy fermion state has been hampered 
by a lack of unbiased numerical methods. While numer- 
ics is still far from being able to address the full richness 
of heavy fermion physics, the combination of dynamical 
mean field theory and methods including the CT-QMC 
approach is beginning to have an impact on the field. 
CT-HYB methods have b een applied to the stud: 



of heavy fermion materials (jHaule and Kotliarl . 1200 



I 



Shim et a/.l . l2007^ but it appears at present that difficul- 



ties arising in the course of dealing with realistic mod- 
els of heavy fermions are sufficiently large that CT-HYB 
methods have been mainly used to spot-check the results 
of other, approximate but much less computationally ex- 
pensive, solvers. A realistic treatment of heavy fermion 
materials must deal with the full complexity of the /-shell 
and is characterized by a multiplicity of interactions, all 
of which are strong, a strong spin orbit coupling and 
(in many of the interesting materials) a low point group 
symmetry, leading to a complicated multiplet structure 
imposed on a local Hilbert space of dimension 4^. This 
Hllbert space is too large to treat directly by a straight- 
forward application of the CT-HYB method. However, 
in many if not all cases only a small portion of the Hilbert 
space is relevant to the physics, so truncation schemes in 
which only a portion of the Hilbert space is retained may 
be appropriate. In some cases, such as elemental Ce or 
Ce-bascd heavy fermion compounds the relevant valence 
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states are f^, and perhaps and a straightforward 
truncation in which all higher occupancies of the / state 
are forbidden works well. In other situations, such as 
Pu, more elaborate schemes involving truncation in en- 
ergy, in valence and in size of sub-matrices is required. 
CT-J methods, which in effect reduce the Hilbert space 
of the local problem to the minimum possible size, are a 
promising alternative route. 

In a very interes ting first step in this direction, Otsuki 
and collaborators ( Otsuki et al. . l2009al l^ have used the 
CT-J method to perform a detailed study of the single- 
site dynamical mean field solution of the spin 1/2 Kondo 
lattice model, defined by the Hamiltonian 



Cfeo 



(219) 



while Matsumoto and collaborators (jMatsumoto et all 
20091) have used the CT-J method along with input from 
ab-initio band theory to describe trends across families 
of heavy fermion compounds. 

This model with antiferromagnetic J is a minimal 
model for heavy fermion physics and also may be used to 
address other theoretical issues, for example by changing 
the sign of J. In the physically relevant antiferromagnetic 
J case the model is believed to have a small J magnetic 
phase and a larger J non-magnetic phase: a Fermi liquid 
if the density of the conduction band is different from 1 
per site and a Kondo insulator if the conduction-band 
density is one per site. The qualitative form of the phase 
diagram has bee n understood since the work of Doniach 
( Doniachl . Il977 ). The band theory phase diagram cal- 
culated by the CT-J method is shown in Fig [37l It 
has the qualitative form proposed by Doniach but the 
CT-J method enables one to understand in detail how 
the high temperatur e local moment pha se crosses over 



to the Fermi liquid (jOtsuki et all l2009al ). and provides 



insight into the relation of the Fermi liquid coherence to 
the magnetic phase diagram and allows one to include 
material-specific information. 

Figure [51] shows t he single-part i cle spe ctral function 
A(k,uj) computed bv lOtsuki et all ( 2009al ) using an an- 
alytical continuation based on Fade approximants. This 
continuation method requires data of extremely high pre- 
cision, available only with the CT-QMC methods. The 
vertical white lines labeled ks indicate the positions of 
the Fermi surface defined by the conduction band elec- 
trons in the absence of any Kondo effect; the line labeled 
kl indicates where the Fermi surface would be if the lo- 
cal moment became an itinerant electron and were folded 
into the conduction band. The left panel shows A{k,uj) 
for the high T = 0.25. The spectrum exhibits a behav- 
ior of almost non-interacting electrons at high energies. 
However, a suppression of density of states is seen near 
the conduction electron Fermi surface ks- 

The right panel shows the spectral function at T = 
0.0025, which is much lower than the impurity Kondo 
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FIG. 37 Phase diagram calculated using CT-J methods with 
band theory input for the material family CeX2Si2. The ab- 
scissa t is defined as the Kondo coupling measured relative to 
the cri tical Kondo coupling for the T = magnetic transition. 
From (|Matsumoto et allWO^ . 




FIG. 38 The single-particle excitation spectrum A{k, u}) for 
J = 0.3 and ric = 0.9 at (a) T = 0.25 and (b) T = 0.0025. The 
slanted line represents the non- interacti ng spectrum a ; = k — ij, 
which is realized for J — 0. From Ref. (|Otsuki et a/.l [20"09al 'l . 



temperature defined by Tk = \Jli^~^^^ ^ 0.1 with 
g = 2Jpc(0). Here the spectral function takes a form 
closely resembling that expected if the conduction band 
is weakly hybridized with a very flat band near the Fermi 
level and the Fermi surface has shifted to the point k^, 
indicating that the local moments in fact contribute to 
the Fermi volume. This behavior was expected, based on 
the detailed understanding which has been obtained for 
the single-impurity Kondo problem, but it is remarkable 
to see the phenomenon clearly exhibited in a lattice cal- 
culation. The fact that the bands are well defined at all 
fc, and that the Kondo hybridization gap which opens up 
at Kg is well defined, are new and somewhat unexpected. 

In a related study ( Hoshino et al. . 2010[ ). Hoshino and 
co-workers considered the Kondo lattice model at con- 
duction band densities n = 1 where at larger J the 
ground state is a paramagnetic Kondo insulator. At 
smaller J the paramagnetic Kondo insulator is unstable 
to an antiferromagnetic insulator ground state. Figure [39l 
shows the spectrum for an intermediate J = 0.2 where 
a Kondo insulator phase is established at intermediate 
temperatures (left panel) and at lower T becomes unsta- 
ble to antifcrromagnctism (right panel) . In the region of 
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FIG. 39 False-color plot of electron spectral function in fre- 
quency u} and scaled momentum k plane for Kondo lattice 
model with antiferromagnetic coupling J = 0.2 in (a) the 
paramagnetic phase at T = 0.035 an d (b) the antiferroma g- 
netic phase at T = 0.010. From Ref. (jHoshino et aUl2010l '). 



(a) 7=0.035 




FIG. 40 False-color plot of electron spectral function in fre- 
quency (xJ and scaled momentum k plane for Kondo lattice 
model with ferromagnetic interaction J — —0.2 in (a) the 
paramagnetic phase at T = 0.035 an d (b) the antiferroma g- 
netic phase at T = 0.010. From Ref. (|Hoshino et oU 120101 ). 



the Brillouin zone presented in the figures the form of the 
spectral function is remarkably similar in the two phases; 
the magnetism merely sharpens the spectral function and 
increases the gap size. Again the hybridization of the lo- 
cal moment into the conduction band is the only reason- 
able interpretation of the formation of the paramagnetic 
insulating state. 

It is interesting to contrast these results with those ob- 
tained for ferromagnetic Kondo coupling, shown in fig l40l 
Here we sec that in the paramagnetic state there is a band 
crossing the Fermi level: the material is not an insulator 
because the Kondo ef fect does not oc cur (a similar effect 
was demonstrated in (j Werner and Mi llis. 2006) using the 
CT-HYB method) and it is only when the antiferromag- 
netic instability occurs that a gap opens up. 

The application of CT-J methods to Kondo-like prob- 
lems is still in its early stages, and it seems likely that 
further extensions to more realistic models, and to clus- 
ter dynamical mean field approaches, will yield further 
insights. 



B. Dynamical mean field theory for realistic models of 
correlated materials 

Dynamical mean field methods are more and more 
widely used in ab-initio based studies to model in a re- 



alistic way the properties of interesting materials. These 
studies involve many subtle issues relating to mapping 
the orbitals and energies derived e.g. from a density func- 
tional band theory calculation onto a theoretical model 
appropriate for solution with dy namical mean field m eth- 
ods. The subject is reviewed in (jKotliar et all l2006l) and 
we will not attempt to summarize the discussion here. 
For the purposes of the present review it is enough to 
note that the correlated electron aspects of real materi- 
als typically involve multiple orbitals and several inter- 
action parameters, so a mapping onto a simple one-band 
Hubbard model is typically not appropriate, while the de- 
manding nature of the band theory computations places 
a premium on having efficient impurity solvers for the 
dynamical mean field calculations. The development of 
CT-QMC methods has therefore had a significant impact 
on the field. The range of applications is large and grow- 
ing rapidly; it will not be summarized here. Rather, we 
will focus on recent results pertaining to one particularly 
challenging, and particularly topical system, the iron- 
based superconductors, where CT-HYB methods have 
made an important contribution to understanding the 
physics. These calculations may be considered as reflec- 
tive of the present "state of the art" of the "realistic 
DMFT" field. 

The unusually high superconducting critical tempera- 
tures together with unusual normal state properties are 
generally agreed to place the iron oxypnictides in the 
broad category of strongly correlated superconductors, 
which also includes the k organics, cerium and plutonium 
based heavy fermions, and cuprate high temperature su- 
perconductors. The correlated electrons reside mainly 
on d-orbitals associated with the Fe site and it appears 
to be necessary to retain all 5 of the states in the d- 
multiplet and to treat carefully both the effects of the U 
interaction which constrains charge fluctuations and the 
J-type interactions which select different states at fixed 
total charge. Because the couplings are neither extremely 
large nor extremely small, approximate methods may not 
be reliable: the full interacting problem must be treated 
by a numerically exact method. The low point symme- 
try of each Fe site means that ligand field effects com- 
pete non-trivially with the interaction effects while the 
hybridization function is complicated, and must be de- 
termined using band theory input. From the dynamical 
mean field side the complexity of the problem is such that 
only single-site DMFT calculations have been attempted, 
sometimes with a further restriction to density-density 
interactions. 

In order to investigate the correlation effects in 
such complicated compounds it is important to have 
consistent one-electron and many-body parts of the 
LDA-I-DMFT Hamiltonian. For example, Aichhorn and 
collaborators studied the material LaOi_a;Fa;FeAs using 
an optimized basis of the localized dpp Wannicr functions 
which was constructed from the 22 Bloch bands, corre- 
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FIG. 41 (Color online) Full (all bands) spectral function for 
dpp Hamiltonian description of LaOFeAs. Black line: LDA. 
Red line: LDA+DMFT (compu ted retaining only de nsity- 
density interactions). From Ref. (lAichhorn et al\.\200^ ). 
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FIG. 42 Histogram of occupation probabilities for each 3d 
atomic state in DMFT calculation for BaFe2As2 at T = 
150A'. The states are sorted by total d occupancy and 
withi n each manifold of f ixed occupancy by energy. From 
Ref. (jKutepov et aZ.I . I2OI0I ). 



spending to the 10 Fe-3d, 6 As-p and 6 0-p states (note 
each unit cell contains two formula uni ts and the point 
symm etry of the two Fe is the same) ( Aichhorn et all 
20091 ). The Green's function and hybridization function 
are constructed from the matrix elements of the Kohn- 
Sham Hamiltonian in the Wannicr basis, while matrix 
elements of the Coulomb interactions were calculated 
from the static limit of a constrained random phase ap- 
proximation. The dynamical mean field theory was con- 
structed by retaining the on-site intra-d interactions and 
projecting the fc-integrated Green function onto the sub- 
space of d Wannicr functions. Other groups use slightly 
different procedures; for example Kutepov et al. used 
a self consistent GW procedure to compute the interac- 
tion and an orbital-based procedure rather than a Wan- 
nicr fu nction-based procedur e to define the basis of local 



states ( Kutepov et al . 2010h . 



Aichhorn et al. then used CT-HYB simulations (but 
with only density-density interactions) at room tempera- 
ture to obtain the full local spectral function for the dpp 
Hamiltonian corresponding to the experimental crystal 
structure of LaFeAsO and the r ealistic Coulomb matrix 
elements ( Aichhorn et al. . [looi). Results are shown in 
Fig. im The LDA-hDMFT DOS near the Fermi level 
displays characteristic features of a metal in an interme- 
diate range of correlations. Both occupied and empty 
states are shifted towards the Fermi level due to the 
Fermi-liquid renormalizations. No high-energy features 
that would correspond to lower or upper Hubbard bands 
can be seen in this LDA+DMFT electronic structure. 

In untangling the physics of the materials the ability of 
the CT-HYB method to provide the components of the 
local density matrix, in particular the probability that 
any one of the atomic states of the iron 3d orbital is oc- 
cupied, is important. This is plotte d for the material 
BaFcsAss in Fig. \M (|Kutepov all I2OIOI ). Even the 



most probable atomic states have a probability of only 
a few percent, hence a naive strong correlation atomic 
limit is qualitatively wrong for this compound. The wide 
spread of energies within a given submanifold is a conse- 
quence of the additional "J-like" interactions. 




FIG. 43 Momentum resolved spectral function A(k,u}) cal- 
culat ed for BaFe2As2. Gray inset: ARPES intensity from 
Ref. (|Brouet et oZ.ll2010l ). From Ref. (|Kutepov et allhoid ). 



Figure l43l shows a false-color representation of the mo- 
mentum resolved spectral functio n in th e 



near-fermi-surface energy range kut^.ov et all . M) 



Near the Fermi level the quasiparticle bands are well 
defined, while at higher energies the structures become 
blurred, reflecting the increased phase space for scatter- 
ing. The quasiparticle velocities are renormalized relative 
to the band theory result (not shown) by factors of 2 for 
— and 3z^ — orbitals and 3 for the xy, xz, yz 
orbitals. The momentum space positions of the Fermi 
surface crossings are in good agreement with photoemis- 
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sion results, as are the renormalizcd velocities. Compar- 
ison of these sorts of calculations to the rapidly growing 
body of experimental data are enabling a comprehensive 
understanding of the physics of novel materials. 



XIII. APPLICATIONS III: NANOSCIENCE 

A. Transport through quantum dots: linear response and 
quantum phase transitions 

One important application of quantum impurity mod- 
els is as representations of "single molecule" conductors 



and other nano-devices ( Hanson et all |2007|) . Much of 



the attention in the nanoscience community has been 
focused on weakly interacting systems or on simple 
Hubbard-like dots. Standard perturbative or Hirsh-Fye 
QMC methods suffice for these situations, although CT- 
QMC methods have been used, e.g. in a study of th e 



accuracy of the GW approximation (jWang et al\ . 120081 ) 



As the field moves towards consideration of quantum dots 
with richer physics, other approaches including CT-QMC 
methods arc likely to become important. 

An example is provided by the two-level two lead 
quantum dot syste m uncovered by Ya c oby et al. 



(jYacobv et al\ . Il995h . IGoIosov and GefenI (|2006l ) sug 



gested that this system could display a quantum phase 
transition between two different relative occupancies as 
level energies were varied. Th is issue was investigate d 
using the CT-HYB method by IWang and MillisI (|2010[) . 
In the general case of the model presented by Gefen the 
imaginary part of the hybridization function (giving de- 
cay of the dot electrons into the leads) does not commute 
with the combination of the level Hamiltonian and the 
real part of the hybridization function (giving the renor- 
malization of the dot energies) . This causes a severe sign 
problem, whic h prevented any u seful simulations in the 



general case. IWang and Miilisl \20l(h argued that the 



universal behavior at a quantum critical point (if one ex- 
isted) could be described by a sign problem-free model 
(essentially because at the critical point the combination 
of the dot Hamiltonian and real part of the hybridization 

function becomes the unit matrix) . 

To investigate the criticality, Wang and Millid ( 2010l ) 
considered the imaginary-time dependence of correla- 
tion functions of variables defined on the quantum dot. 
Fig. l44l shows three different behaviors at times ^ (3/2: a 
dependence expected for a Fermi liquid for small U/t, 
a power law at the critical point, and a constant long 
time behavior for large U in the non-Fermi liquid phase. 
However, impurity problems may be characterized by ex- 
ponentially small scales such as the Kondo effect. Distin- 
guishing a very small scale from a true phase transition 
is numerically challenging. The ability of CT-HYB to 
access very low temperatures ~ lQ~^t provides reason- 
able evidence of a critical point. However, for problems 



1<S> 



3. 



0.6 



0.4 



0.2 



r 



U'7t=0.6 
U'7t=0.4 -V- 
U'7t=0.3 ■ 
U'7t=0.2 -H- 
U'7t=0 -e- 



0.2 



0.4 
10'*(T/t)^ 



0.6 



FIG. 44 Imaginary time density-density correlation function 
W of two-level, two-lead model evaluated using CT-HYB at 
midpoint of imaginary time interval and normalized to value 
at T = O.Olt, as function of interaction strength with level 
energies tuned to be equal. Weak T and r dependence is seen 
in non-Fermi-liquid phase ([/ = 0.4, 0.6) and strong T and t 
depende nce in Fe rmi liqui d pha se ([/ = 0,0.2). For details 
see Ref. l|Wang and Millia 120101) . 



such as this where the key question concerns the asymp- 
totic low energy behavior, quas i-analytical fun c tiona l 



renormalization grou p methods ( IKarra sch et al. j_ 2006 ) 



and N RG approaches (jBulla et ah . i2008i : iKarrasch et al 
20071 ) may be more powerful. 



B. Metal atom clusters on surfaces 

An active area of nanoscience research concerns the 
properties of one or more transition metal ions on a metal 
surface. Of particular interest is the density of states, 
which may be compared to sca nning-probe micros copy 
data. Savkin and coworkers in ( Savkin et all [2005 ) ap- 
plied the CT-INT scheme to a model of three interacting 
Kondo impurities on a metallic surface. The ability of 
the CT-QMC methods to treat realistic interactions al- 
lowed an accurate investigation of the interplay of cluster 
geometry, inter-adatom hopping, local Coulomb interac- 
tions and the Heisen berg exchange interactions between 



magnetic impurities. ISavkin et al\ ()2005() showed that a 



rotationally invariant antiferromagnetic exchange inter- 
action is almost twice as efficient in suppression of the 
single site Kondo effect as is the Ising like interaction 
which was all that could be treated by previous meth- 
ods. 

The possibility to make quantitative comparisons to 
experiment highlights the need to incorporate as much 
material speci ficity as possible into the calculation. 
Gorelovl (|2007l) performed a realistic study of Co atoms 
in the bulk or at the surface of a Cu host. They found 
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that a complete treatment of the problem, including all 
inequivalent terms of the Coulomb interaction, was es- 
sential for obtaining physically relevant results. Inclu- 
sion of all of the interaction terms however produces a 
severe sign problem. While the sign problem can be mit- 
igated to some extent by an appropriate choice of basis, 
it severely limits the range of temperatures over which 
results can be obtained. These calculations represent the 
current state of the art: they push the CT-INT tech- 
nique to its limits and demonstrate the need for further 
algorithmic developments. 

To set up the problem, density functional band theory 
techniques were applied to appropriately chosen supercell 
geometries. From these calculations wave functions di{r) 
for the Co d-states and itinerant electron wave functions 
"^nkir) were extracted. The bare local Green's function 
is then obtained as 



E 



(d,i^„fc)(^„fcidj) 



(220) 



while the Coulomb interaction Hint = \ J2ijkiaa' ^ijki 
cl^Cj^iCkcr'Cia iuvolves matrix elements of the form 
Uijki = {di{ri)dj(r2) ^^^f_^^^ dk{r2)di{ri)) . The number 
of interaction terms which must be considered is large, 
and depends on the choice of basis in the d-sector. 

Use of symmetries to rearrange the interaction and 
eliminate redundant terms was also found to be impor- 
tant. Implementing all the symmetries and making an 
optimal basis choice led Gorleov et al to an expression 
for the partition function as an expansion in 129 inde- 
pendent interaction parameters: 
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The expansion exhibits a "trivial" sign problem which 
may be mitigated by appropriate choice of a parameters 
as discussed in Sec. IIII.AI although the multiplicity of 
interactions requires a m ultiplicity of a parameters; for 
furth er details see Refs. (jCorelovl uOOTt ICorelov et all . 
20091) . The expansion also suffers from an 'intrinsic' sign 



problem (not curable by choice of a) whose severity was 
found to depend on the basis choice. "Three orbital" 
terms Uikki with I ^ i were found to produce a severe 
sign problem but do not occur if a spherical harmonic 
basis is used. However, "non-diagonal" terms Uijki with 
(j j,k I) cannot be eliminated by transformations, 
make important contributions to the physics, and give 
rise to a sign problem, of a severity which depends on 
other features such as the Green's functions. 

To test both the expansion and the importance of 
the non-diagonal terms, Gorelov et al. determined the 
Green's function for one orbital of the isolated atom 
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FIG. 45 (color online) Comparison of CT-INT expansion of 
Eq. (|22ip with exact diagonalization for an isolated Co atom 
with (7=1 eV, J = 0.4 eV, at inverse temperature 13 — 2 
eV^^, for 5 electrons (main panel) and 8 electrons (inset). 
Solid lines and crosses: Full Hamiltonian. Dashed lines and 
plus symbols: model spec ified by diagonal terms only. From 
Ref. (|Gorelov et a/.ll2009l ). 




FIG. 46 (color online) Left panel: Total DOS of 3d orbital of 
Co atom embedded in Cu matrix. Model parameters: J7 = 4 
eV, J = 0.7 eV, /3 = 10 eV"^ for 5-orbital impurity with 7 
electrons. Right panel: Total DOS of 3d orbital of Co atom 
embedded in the bulk of Cu, into 1-st layer and Co-adatom on 
the Cm(111) surface. Model parameters: (7 = 4 eV, J = 0.7 
eV, P = 10 eV~^ for 5-orb ital impurity with 7 electrons. From 
Ref. (|Gorelov et a/.ll2009l ). 



(i.e. with no hybridization function) using both CT-INT 
based on Eq. (|22ip (with the 129 interaction parameters) 
and by exactly diagonalizing the problem. Fig. US] shows 
that the CT-INT expansion reproduces the exact result 
and that the 'non-diagonal' terms in the interaction are 
important. 

Gorelov et al. then computed the local density of states 
of a Co atom in bulk Cu (far from a surface). For this 
case the sign problem, while present, is not severe for the 
temperatures studied (T « 12007^). Results are shown 
in the left panel of Fig. |46l The non-diagonal interactions 
have an important effect on the line shape. (Note that 
T w 1200i^ is well above the Kondo temperature, so no 
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0.25 



peak is evident at the Fermi surface). 

Finally, Gorelov et al. consider a Co impurity at a 
surface (right panel of Fig. |46|) . Here the relatively large 
non-diagonal elements of the bath Green's function lead 
to a serious sign problem. To make a simulation on 
the surface feasible, Gorelov et al. in effect restricted 
the sampling to a constant-sign subset of configuration 
space, by only allowing updates that d i d not change the 
fermionic sign. See Ref. (jGorelov et 120091 ) for further 
details. 



XIV. APPLICATIONS IV: NON-EQUILIBRIUM 
IMPURITY MODELS AND NANOSCALE TRANSPORT 

A. Overview 

CT-QMC methods have been used to study nonequilib- 
rium problems defined on the "Keldysh" two-time con- 
tour. These studies arc still in their early stages and 
we present here a few representative preliminary results 
concerning the current-voltage characteristics of inter- 
acting quantum dots, as well as simulations inspired by 
the newly developing capabilities of performing pump- 
probe experiments on correlated electron compounds and 
"quantum quench" experiments on cold atom systems. 



B. Results: Current- voltage characteristics 

1. Real-time CT-HYB 
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FIG. 47 Points: total current I computed using nonequi- 
librium CT-HYB methods as a function of the bias volt- 
age for a half filled, spinless, phonon-coupled quantum dot 
(Eq. (|133p with U=0) with voltage bias applied symmetri- 
cally, £d = 0, IJ.L,R ~ ±-j at temperature T — ^ at electron- 
phonon coupling strengths A and oscillator frequencies ujo in- 
dicat ed. Lines: result s of an appro ximate analytical calcula- 
tion (|Flensberg| . [200^ ). From Ref. (|Muhlbacher and Rabanil. 
l2008l ). 



lished in ( Schiro and _Fabriziol[200ilSchmidt et aLl . 120081: 



Werner et al . 2009lJ ). Because the expansion must m 
this case be performed for both spin flavors and the de- 
cohering effect of phonons is not included, reaching a 
steady state becomes challenging. Attempts to optimize 
the algorithm by considerin g initial states with dot-lead 
entanglement (ISchirdlioiol) have not led to dramatic im- 
provements. 



The first nonequilibrium applications of the CT-QMC 
technique were to the current-voltage characteristics of 
a quantum dot under a bias voltag e . In their pioneer- 
ing paper, iMiihlbacher and Rabanil (|2008l ) showed that 
the hybridization expansion method could be directly ap- 
plied on the Keldysh contour and that long enough times 
could be reached to permit measurements of steady state 
behavior. They studied a non-interacting dot coupled 
to phonons (essentially the Holstein-Hubbard model, 
Eq. ()133p . with spin neglected and U = 0); representative 
results giving the dependence of the currerent-voltagc 
characteristics on the oscillator frequency and coupling 
strength are presented in Figure 1471 

These calculations start from an initial state in which 
the dot is decoupled from the leads and the calculation 
must build in appropriate dot-lead entanglement. This 
requires a coherence times whic h depends on the physics . 
In the calculations of Ref. (Miihlbacher and Rabani. 



20081 ) convergence was facilitated by decoherence aris- 
ing both from the phonons and from the relatively 
high T which was studied. Results for the interact- 
ing Anderson model (without phonons) have been pub- 



2. Real-time CT-AUX 

In the weak coupling methods, for example the CT- 
AUX algorithm explained in Sec. IVIII.A) one may use 
'interaction quench' methods in which the real-time sim- 
ulation starts from a U = state with dot-lead entan- 
glement. Temperature enters only as a parameter in 
the lead correlators, making it possible to treat arbi- 
trary temperatures, including T = 0. While the pres- 
ence of interactions of course modifies this entanglement, 
it seems that up to interaction strengths oi U w lOF, 
relatively few perturbation orders are required to reach 
steady state. The situation is particularly favorable for 
particle-hole symmetric models with symmetrically ap- 
plied bias, where odd orders of perturbation theory can 
be suppressed. As illustration we present interaction- 
quench results for the time-dependence of the current 
and the current-voltage characteristics of half- filled quan- 
tum dots with symmetrically applied voltage bias (/il = 
-^iR = V/2). 

At time t = 0, the system is non-interacting but sub- 
ject to an applied bias V, so a current loiV) appropriate 
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FIG. 48 Left panel: time evolution of the current for different 
voltage biases {U /T — 6,T — 0). Right panel: current- voltage 
characteristics obtained from long-time limit of the calculated 
currents. The symbols show Monte Carlo data for U/T = 4, 
6, 8 and 10, while the lines correspond to fourth order pertur- 
bation theory. In the initial state, the current is given by the 
steady state current through the non-interactin g dot. At time 
t = , the interaction is turned on. From Ref. (|Werner et all 
I2OIOI '). 




FIG. 49 Momentum distribution n{ek,t) for quenches from 
U/W = to U/W = 0.75 (left panel) and U /W = 1.25 (right 
panel), for Hubbard model with semicircular density of states 
and bandwidth W. t is m easured in units of th e quarter- 
bandwidth W/4. From Ref. l|Eckstein et oLl.l2010l V 



3. Nonequilibrium DMFT 



to the non-interacting model is flowing through the dot. 
At f = 0+ the interaction is turned on and the system 
relaxes into the steady-state configuration appropriate to 
the interacting model. The left panel of Fig. 05] plots the 
time evolution of the current for fixed CZ/F = 6 and sev- 
eral voltage biases. For voltages V/T > 2, even though 
the transient behavior is clearly voltage-dependent, the 
current settles into the new steady state after a time 
tr w 2. However, as the voltage is decreased below 
V/T w 2 the transient time increases. At = F the 
long time limit is attained only for iF > 3 and as V is 
further decreased, the interaction-quench method cannot 
reach the steady state 



As discussed in (Werner et 



201Clf ). in the small-F regime, voltage quench simulations 
are a possible alternative to the interaction quench. In 
the voltage-quench calculations, the time-evolution starts 
from the interacting equilibrium state, which is a good 
starting point for small V. However, because the imagi- 
nary branch of the L-shaped contour must be explicitly 
treated, this approach is restricted to non-zero tempera- 
tures, and is only advantageous for very small voltage. 

The right panel of Fig 05] presents T = results for 
the voltage dependence of the steady state current as 
well as analytic results obtained from fourth order per- 
turbation theory order. The interacting current initially 
rises with the same slope as the non-interacting current, 
and reaches the non-interacting value also in the large- 
voltage limit. At intermediate values of V the effect of 
interactions is to suppress the current (Coulomb block- 
ade) . The results show clearly that at intermediate volt- 
ages the method can access interaction regimes beyond 
the scope of analytical perturbati on theory. In agree - 



ment with conclusions reached in ( Werner et al. I. l2009bl) 



on the basis of (less accurate) hybridization expansion 
results and also with recent noneq uilibrium fun c tiona l 
renormalization group calculations ( Jakobs et al. . 2010l ) 



The real-time CT-QMC methods can also serve as 
impurity solvers in nonequilibrium-DMFT simulations 
of bulk systems. The impurity Hamiltonian of Sec- 
tion |VITLS1 is the impurity problem relevant for the so- 
lution of the one-band Hubbard model. In the DMFT 
context, however, i/bath (Eq- (|15ip ) is a single bath, 
whose parameters are fixed by a self-consistency equation 
which in the non-equi l ibrium context is time depe ndent 
(|Freericks et aLl . l2006t ISchmidt and Monienl . l2002[ ) . 



CT -QMC have been used by lEckstein et oZI (|2009l 
2OIOI ) to study the relaxation dynamics of the half- 
filled Hubbard model after a sudden switching-on of the 
electron repulsion U . The initial state was the non- 
interacting equilibrium state at temperature T = 0, and 
the DMFT selfconsistency assumed a semi-circular den- 
sity of states of bandwidth W ~ 4. The calculation pro- 
duces among other observables the time-evolution of the 
momentum distribution function n(ek,t) which is plot- 
ted in Fig.l3n]for quenches to [/ = 3VF/4 and U = 5W/i. 
Qualitative differences in the relaxation dynamics appear 
as the value of the interaction strength is changed. 



we see that this model does not display a region of neg- 
ative differential resistance. 



It was demonstrated in ( Eckstein et al\ . I2OO9I I2OIOI) 
that for a quench to U = 0.8M^ the momentum distri- 
bution function and double occupancy relax very fast 
(within a time t = 6.4/VF) to the thermal equilibrium re- 
sult compatible with energy conservation. The relevant 
time-scale is in this case easily accessible with real-time 
CT-AUX. Away from this "critical interaction strength" , 
i. e. for quenches to U ^ 0.81^, the system is initially 
trapped in a non-thermal quasi-stationary state, and 
equilibration occurs on much longer time scales. In this 
case the accessible times are not long enough to observe 
the expected thermalization (see right panel of Fig. [49]) . 
Only the initial relaxation into the quasi-stationary state 
and (for U < O.SW^) the initial part of the slow crossover 
towards the thermal equilibrium state are computation- 
ally accessible with present techniques. 
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XV. PROSPECTS AND OPEN ISSUES 

Over the last few years, continuous-time quantum 
Monte Carlo methods for fermionic impurity problems 
have been developed to a high degree. Because the meth- 
ods are based on a diagrammatic expansion, they can 
handle many physically relevant interactions, which were 
not easily treatable by other methods. Also, by construc- 
tion they are free from the time-discretization errors asso- 
ciated with methods previously used for fermion s, base d 
on the Suzuki- Trotter decomposition (jSuzukil . Il97fih . 
Third, a continuous-time formulation provides, in a 
sense, a many-body adaptive grid method for the time 
evolution. These advantages of the continuous-time for- 
mulation enable a decrease, typically by several orders of 
magnitude, in the computational effort required to solve 
a problem of a given complexity, making previously in- 
tractable problems tractable and creating new opportu- 
nities for physics by allowing rapid and routine investi- 
gation of problems which had previously required access 
to supercomputer facilities. 

The methods have become very important to the field 
of correlated electron physics (via th e connection t o single 



( Georges et al . 19961 ) and cluster ( Maier et al . 2005a ) 



dynamical mean field theory) and are having an increas- 
ing impact on nanoscience. However, the first genera- 
tion of results has only begun to explore what is possi- 
ble. We expect that over the next few years the meth- 
ods will be increasingly widely used in dynamical mean 
field computations of correlated electron materials and 
condensed atomic gases, and in studies of the equilib- 
rium and nonequilibrium phenomena arising in the im- 
purity models relevant to nanoscience. The nonequilib- 
rium applications in particular represent an entirely new 
field with many exciting possibilities. Methodological im- 
provements, including combinations of hybridization and 
coupling constant expansions and the further study of 
projected Hilbcrt space methods such as CT-J arc likely 
to be fruitful. We hope that an increasing number of sci- 
entists will take advantage of the opportunities, by ap- 
plying the methods to yet wider classes of problems and 
by developing them further. 

We conclude our discussion by summarizing what we 
perceive to be the strengths and weaknesses of the dif- 
ferent CT-QMC methods, and suggesting some issues 
that may warrant further attention. The fundamental 
issues for any algorithm are the scaling with tempera- 
ture, interaction strength, and system size. In addition, 
for fermions, one must consider the sign problem. Table 
[l] summarizes what we know about these scalings. 

The hybridization expansion algorithm, CT-HYB, di- 
agonalizes the local Hamiltonian and expands in the 
impurity-bath hybridization. The principal advantage of 
this approach is that instantaneous (Hamiltonian) inter- 
actions of essentially arbitrary strength and functional 
form can be handled (retarded interactions can be con- 



veniently treated only in special, but physically relevant 
cases such as the screened density-density interaction). 
The hybridization expansion appears to suffer from a se- 
vere sign problem if the hybridization function does not 
commute with the one-body part of the local Hamilto- 
nian, and this limits its use in the most general contexts. 
It appears to be most useful for the single-site dynam- 
ical mean field theory of materials with partly filled d 
and / shells, where its ability to treat the full complexity 
of general multiplet interactions is unmatched and the 
point symmetry ensures that the local Hamiltonian and 
hybridization functions commute. 

The fundamental computational bottlenecks of the hy- 
bridization method are the need to manipulate matrices 
whose size is set by the dimension of the full fermionic 
Hilbert space of the impurity Hamiltonian and the need 
to compute determinants of hybridization matrices of a 
size linearly growing with /3. The computational bur- 
den grows exponentially with the size of the fermionic 
problem and as the cube of the inverse temperature, and 
the system-size constraint is therefore more severe. For 
a model of N spin degenerate orbitals the Hilbert space 
size is 4^. At present, 5 orbital models are accessible 
with large scale computing resources. For larger systems 
a straightforward approach is not feasible yet without 
truncation. The accuracy of truncation schemes is not 
yet established. Of course, in special cases block diago- 
nalization is possible so the full Hilbert space need not be 
treated. In the most favorable case, the local Green func- 
tion, hybridization matrix and interaction may all be di- 
agonalized in the same single-particle occupation number 
basis (this occurs in the A^-orbital impurity model with 
density-density interactions, if each orbital hybridizes 
with a different bath) and the segment representation 
of the hybridization expansion may be used. In this 
case there is no sign problem and the cost is linear in 
the number of orbitals and cubic in the inverse temper- 
ature. Thus if a segment representation exists, it should 
be used. Unfortunately, in most problems of physical in- 
terest either hybridizations or interactions entangle the 
different single-particle basis states, and a general ma- 
trix formulation is required. In this case the exponen- 
tial scaling associated with the Hilbert space size is the 
crucial constraint, and an important open problem con- 
cerns the degree to which the Hilbert space can be block 
diagonalizcd or truncated. Haule pioneered the use of 
symme t ry-ba sed block diagonalization and of truncation 
(jHauld . 120071 ). An alternative approach based on sparse 
matrix-vector instead of dense matrix-matrix multiplica- 
tion is the Krylov technique discussed in Sec. IV. Dl Fur- 
ther research along these and related lines appears to be 
worthwhile. 

The interaction expansions CT-INT and CT-AUX are 
based on an expansion about the free-fermion limit. The 
computational burden therefore increases with the inter- 
action strength, as well as with inverse temperature and 
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Scaling / Algorithm 


CT-INT 


CT-AUX 


CT-HYB (segment) 


CT-HYB (matrix) 


diagonal hybridization 
non-diagonal hyb. 
diagonal interaction 
general Uijki 


NiPUf 
(NPU)^, sign prob. 
(N/SU)^, sign prob. 
{N'^puf, sign prob. 


N{pUf 
{NpUf, sign prob. 
(NPU)^, sign prob. 
n/a 


NP'-^ 
{NP)''^, sign prob. 
(NP)^, sign prob. 

n/a 


ae'^P^ +bNP^,a:» b 
ae'^P^ + b{Npf, a:>b, sign prob. 
ae'^P'^ + b(Np)^,a > b, sign prob. 
ae^ P'^ + b{Npf,a > b, sign prob. 



TABLE I Summary of scaling and sign metrics in the equilibrium case for most widely studied continuous-time quantum 
Monte Carlo methods. CT-INT, CT-AUX, CT-HYB refer to the interaction expansion (Sec.|Ill|, auxihary field (Sec.|lVll and 
hybridization (Sec. |V]) expansion algorithms respectively. "Segment" refers to the case of the hybridization interaction where 
the hybridization function, local Hamiltonian and interaction are all diagonal in the same basis (|V.B|) . while "matrix" refers 
to the general implementation in Sec. IV. CI We distinguish Green functions which can be diagonalized by one single canonical 
transformation from general Green functions where the hybridization function, local Hamiltonian and self energy do not all 
commute, and we distinguish interactions such as the Hubbard U which are diagonal in an appropriate single-particle occupation 
number basis from those such as spin exchange and "pair hopping" which cannot be diagonalized. sign prob. indicates the 
possibility of the presence of a fermionic sign problem. 



the system size, making it difficult to access the very 
strong coupling regime. However, the scaling with sys- 
tem size is power-law, rather than exponential, so that 
these methods are the only ones feasible when many or- 
bitals or many sites are important to the physics. At 
present, a lack of good auxiliary field decompositions 
means that the CT-AUX method can only be used for 
models with density-density interactions. Its main ap- 
plication has been to cluster dynamical mean field stud- 
ies of the Hubbard model. A natural subject for fur- 
ther investigations is the application of the method to 
wider classes of models, including more general (but still 
density-density) interactions. The CT-INT method is 
equivalent to the CT-AUX method for Hubbard like in- 
teractions (although the present CT-AUX implementa- 
tions appear to be more efRcient), and is applicable to 
models with general (non density-density) interactions. 
However, sign problems occur and grow in severity as 
the complexity of the interaction increases. 

Our experience in the nonequilibrium context is that 
in a particle-hole symmetric model, the CT-INT and CT- 
AUX methods are to be preferred over the hybridization 
methods because odd pertu rbation orders can be sup- 
pressed I Werner et al. . 2010f ). resulting in a less severe 
sign problem and longer accessible times. 

There are two limitations associated with the CT-INT 
and CT-AUX methods. One issue for more realistic mod- 
els with more complicated interactions is the need to 
make a multiple expansion in all components of the inter- 
action. This is not a serious issue as long as no sign prob- 
lem is encountered. The more fundamental limitation 
is the sign problem, which can arise in cluster dynam- 
ical mean field calculations from the presence of physi- 
cal (real-space) fermionic loops or more generally from 
non-commutativity of operators appearing in the impu- 
rity model, due for example to exchange interactions or 
to hybridization functions which cannot be diagonalized 
by a single (time-independent) basis change. Sign prob- 
lems are in general dependent on the choice of basis and 



further exploration of different representations of the in- 
teraction and the Green function, especially in the case 
of non-diagonal interaction, may be worthwhile. 
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